A path integral approach to stochastic optimal control
Course URL: http://videolectures.net/oiml05_kappen_piaso/
Lecturer: Bert Kappen
Institution: Radboud University Nijmegen
Date: 2007-02-25
Language: English
Course description: Many problems in machine learning use a probabilistic description. Examples are pattern recognition methods and graphical models. As a consequence of this uniform description, one can apply generic approximation methods such as mean field theory and sampling methods. Another important class of machine learning problems is reinforcement learning, also known as optimal control. Here too a probabilistic description is used, but so far no efficient mean field approximations have been obtained. In this presentation, I consider linear-quadratic control of an arbitrary dynamical system and show that, for this class of stochastic control problems, the non-linear Hamilton-Jacobi-Bellman equation can be transformed into a linear equation. The transformation is similar to the one used to relate the Schrödinger equation to the Hamilton-Jacobi formalism. The computation can be performed efficiently by means of a forward diffusion process, which can be computed by stochastic integration or described by a path integral. For this path integral, it is expected that a variational mean field approximation can be derived.
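The forward-diffusion computation mentioned in the description can be illustrated with a minimal sketch. It is not taken from the lecture itself. It assumes a one-dimensional system dx = (f(x,t) + u) dt + dξ with noise variance ν dt, control cost (R/2)u², state cost V, end cost φ, and the log transform J = -λ log ψ with λ = νR, under which ψ becomes an expectation over uncontrolled diffusion paths. The names `estimate_cost_to_go`, `exact_quadratic_cost_to_go`, `f`, `V`, `phi`, `nu`, `R` and `alpha` are all hypothetical choices for this illustration.

```python
import numpy as np


def estimate_cost_to_go(x0, t0, T, f, V, phi, nu, R,
                        n_samples=100_000, n_steps=50, seed=0):
    """Monte Carlo estimate of J(x0, t0) = -lam * log psi(x0, t0).

    Assumed setting (not from the source): psi is the expectation of
    exp(-(phi(x_T) + integral of V dt) / lam) over paths of the
    uncontrolled diffusion dx = f dt + d(xi), with lam = nu * R.
    """
    rng = np.random.default_rng(seed)
    lam = nu * R
    dt = (T - t0) / n_steps
    x = np.full(n_samples, float(x0))
    path_cost = np.zeros(n_samples)
    t = t0
    for _ in range(n_steps):
        # Accumulate the state cost along each sampled path.
        path_cost += V(x, t) * dt
        # Euler-Maruyama step of the uncontrolled (u = 0) dynamics.
        noise = np.sqrt(nu * dt) * rng.standard_normal(n_samples)
        x = x + f(x, t) * dt + noise
        t += dt
    s = -(path_cost + phi(x)) / lam
    # Log-mean-exp for numerical stability.
    m = s.max()
    log_psi = m + np.log(np.mean(np.exp(s - m)))
    return -lam * log_psi


def exact_quadratic_cost_to_go(x0, t0, T, alpha, nu, R):
    """Closed form for f = 0, V = 0, phi(x) = alpha * x**2 / 2."""
    g = 1.0 + alpha * (T - t0) / R
    return 0.5 * nu * R * np.log(g) + 0.5 * alpha * x0 ** 2 / g


if __name__ == "__main__":
    est = estimate_cost_to_go(
        x0=1.0, t0=0.0, T=1.0,
        f=lambda x, t: 0.0,          # no drift
        V=lambda x, t: 0.0,          # no running state cost
        phi=lambda x: 0.5 * x ** 2,  # quadratic end cost
        nu=1.0, R=1.0,
    )
    print("sampled:", est)
    print("exact:  ", exact_quadratic_cost_to_go(1.0, 0.0, 1.0, 1.0, 1.0, 1.0))
```

The drift-free quadratic case is used only because its cost-to-go has a closed form to compare against; for a non-trivial `f` or `V` the same sampler applies, but the accuracy then also depends on the step size.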
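Keywords: machine learning; probabilistic description; optimal control problems; linear equation; path integral
Source: VideoLectures.NET
Last reviewed: 2020-06-02, 张荧 (volunteer course editor)
Views: 216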