Large-Scale Markov Decision Problems with KL Control Cost and Its Application to Crowdsourcing
Course URL: http://videolectures.net/icml2015_malek_crowdsourcing/
Lecturer: Alan Malek
Institution: University of California
Date: 2015-09-27
Language: English
Course description: We study average and total cost Markov decision problems with large state spaces. Since the computational and statistical costs of finding the optimal policy scale with the size of the state space, we focus on searching for near-optimality in a low-dimensional family of policies. In particular, we show that for problems with a Kullback-Leibler divergence cost function, we can reduce policy optimization to a convex optimization and solve it approximately using a stochastic subgradient algorithm. We show that the performance of the resulting policy is close to the best in the low-dimensional family. We demonstrate the efficacy of our approach by applying it to budget allocation in crowd labeling, an important crowdsourcing application.
Keywords: large state space; Markov decision problem; low-dimensional policy
Source: VideoLectures.NET
Data collected: 2022-12-07 (chenjy)
Last reviewed: 2022-12-07 (chenjy)
Views: 29