Hierarchical POMDP Controller Optimization by Likelihood Maximization
Course URL: http://videolectures.net/uai08_toussaint_hpco/
Lecturer: Marc Toussaint
Institution: Technical University of Berlin
Date: 2008-07-30
Language: English
Course description: Planning can often be simplified by decomposing the task into smaller tasks arranged hierarchically. Charlin et al. recently showed that the hierarchy discovery problem can be framed as a non-convex optimization problem. However, the inherent computational difficulty of solving such an optimization problem makes it hard to scale to real-world problems. In another line of research, Toussaint et al. developed a method to solve planning problems by maximum likelihood estimation. In this paper, we show how the hierarchy discovery problem in partially observable domains can be tackled using a similar maximum likelihood approach. Our technique first transforms the problem into a dynamic Bayesian network through which a hierarchical structure can naturally be discovered while optimizing the policy. Experimental results demonstrate that this approach scales better than previous techniques based on non-convex optimization.
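Keywords: non-convex optimization problem; maximum likelihood estimation; dynamic Bayesian network
Source: VideoLectures.NET
Last reviewed: 2020-06-06 by 张荧 (volunteer course editor)
Views: 48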