
Large-Scale Markov Decision Problems with KL Control Cost and its Application to Crowdsourcing
Course URL: http://videolectures.net/icml2015_malek_crowdsourcing/
Lecturer: Alan Malek
Institution: University of California
Date: 2015-09-27
Language: English
Course description: We study average and total cost Markov decision problems with large state spaces. Since the computational and statistical costs of finding the optimal policy scale with the size of the state space, we focus on searching for near-optimality in a low-dimensional family of policies. In particular, we show that for problems with a Kullback-Leibler divergence cost function, we can reduce policy optimization to a convex optimization and solve it approximately using a stochastic subgradient algorithm. We show that the performance of the resulting policy is close to the best in the low-dimensional family. We demonstrate the efficacy of our approach by controlling the important crowdsourcing application of budget allocation in crowd labeling.
Keywords: large state spaces; Markov decision problems; low-dimensional policies
Source: VideoLectures.NET
Data collected: 2022-12-07:chenjy
Last reviewed: 2022-12-07:chenjy
View count: 23