Demo - Control of an octopus arm using GPTD
Course URL: http://videolectures.net/icml07_engel_demo/  
Lecturer: Yaakov Engel
Institution: University of Alberta
Course date: Not available.
Course language: English
Course description: Although Bayesian methods for Reinforcement Learning can be traced back to the 1960s (Howard's work in Operations Research), they have been used only sporadically in modern Reinforcement Learning. This is in part because non-Bayesian approaches tend to be much simpler to work with. However, recent advances have shown that Bayesian approaches need not be as complex as initially thought, and that they offer several theoretical advantages. For instance, by keeping track of full distributions (instead of point estimates) over the unknowns, Bayesian approaches permit a more comprehensive quantification of the uncertainty regarding the transition probabilities, the rewards, the value function parameters and the policy parameters. Such distributional information can be used to optimize, in a principled way, the classic exploration/exploitation tradeoff, which can speed up the learning process. Similarly, active learning for reinforcement learning can be naturally optimized. The gradient of performance with respect to value function and/or policy parameters can also be estimated more accurately while using less data. Bayesian approaches also facilitate the encoding of prior knowledge and the explicit formulation of domain assumptions. The primary goal of this tutorial is to raise the awareness of the research community with regard to Bayesian methods, their properties and their potential benefits for the advancement of Reinforcement Learning. An introduction to Bayesian learning will be given, followed by a historical account of Bayesian Reinforcement Learning and a description of existing Bayesian methods for Reinforcement Learning. The properties and benefits of Bayesian techniques for Reinforcement Learning will be discussed, analyzed and illustrated with case studies.
Keywords: Bayesian methods; encoding; machine learning; Bayesian learning
Course source: VideoLectures.NET
Last reviewed: 2019-11-28: cwx