
Lagrange Dual Decomposition for Finite Horizon Markov Decision Processes
Course URL: http://videolectures.net/ecmlpkdd2011_furmston_lagrange/  
Lecturer: Thomas Furmston
Institution: University College London
Date: 2011-11-30
Language: English
Course description: Solving finite-horizon Markov Decision Processes with stationary policies is a computationally difficult problem. Our dynamic dual decomposition approach uses Lagrange duality to decouple this hard problem into a sequence of tractable sub-problems. The resulting procedure is a straightforward modification of standard non-stationary Markov Decision Process solvers and gives an upper bound on the total expected reward. The empirical performance of the method suggests not only that it converges rapidly, but also that it compares favourably with standard planning algorithms such as policy gradients and lower-bound procedures such as Expectation Maximisation.
Keywords: stationary policies; Markov decision processes; dual decomposition
Source: VideoLectures.NET
Last reviewed: 2020-10-22:chenxin