Introduction to Reinforcement Learning and Bayesian Learning
Course URL: http://videolectures.net/icml07_ghavamzadeh_itrl/
Lecturer: Mohammad Ghavamzadeh
Institution: University of Alberta
Date: 2007-06-22
Language: English
Course description: Although Bayesian methods for Reinforcement Learning can be traced back to the 1960s (Howard's work in Operations Research), they have been used only sporadically in modern Reinforcement Learning. This is in part because non-Bayesian approaches tend to be much simpler to work with. However, recent advances have shown that Bayesian approaches need not be as complex as initially thought, and that they offer several theoretical advantages. For instance, by keeping track of full distributions (instead of point estimates) over the unknowns, Bayesian approaches permit a more comprehensive quantification of the uncertainty regarding the transition probabilities, the rewards, the value function parameters and the policy parameters. Such distributional information can be used to optimize, in a principled way, the classic exploration/exploitation tradeoff, which can speed up the learning process. Similarly, active learning for Reinforcement Learning can be naturally optimized. The gradient of performance with respect to value function and/or policy parameters can also be estimated more accurately while using less data. Bayesian approaches also facilitate the encoding of prior knowledge and the explicit formulation of domain assumptions. The primary goal of this tutorial is to raise the awareness of the research community with regard to Bayesian methods, their properties and their potential benefits for the advancement of Reinforcement Learning. An introduction to Bayesian learning will be given, followed by a historical account of Bayesian Reinforcement Learning and a description of existing Bayesian methods for Reinforcement Learning. The properties and benefits of Bayesian techniques for Reinforcement Learning will be discussed, analyzed and illustrated with case studies.
Keywords: Bayesian reinforcement learning methods; tracking unknowns; transition probabilities
Source: VideoLectures.NET
Last reviewed: 2019-04-17 (lxf)
Views: 162