Efficient Bayes-Adaptive Reinforcement Learning using Sample-Based Search
Course URL: http://videolectures.net/onlinelearning2012_guez_reinforcement_le...
Lecturer: Arthur Guez
Institution: University College London
Date: 2013-05-28
Language: English
Description: Bayesian model-based reinforcement learning is a formally elegant approach to learning optimal behaviour under model uncertainty, trading off exploration and exploitation in an ideal way. Unfortunately, finding the resulting Bayes-optimal policies is notoriously taxing, since the search space becomes enormous. In this talk we introduce a tractable, sample-based method for approximate Bayes-optimal planning which exploits Monte-Carlo tree search. Our approach outperformed prior Bayesian model-based RL algorithms by a significant margin on several well-known benchmark problems, because it avoids expensive applications of Bayes' rule within the search tree by lazily sampling models from the current beliefs. We illustrate the advantages of our approach by showing it working in an infinite state space domain which is qualitatively out of reach of almost all previous work in Bayesian exploration.
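
The sketch below is a minimal illustration of the idea summarised in the description, not the speaker's actual implementation: each simulation draws one MDP from the current posterior ("lazy" root sampling) and then runs a standard UCT rollout inside that sampled model, so Bayes' rule is never applied inside the search tree. The names (Posterior, uct_plan, reward_fn), the Dirichlet-multinomial belief, the (state, depth) tree keys, and the parameter defaults are all illustrative assumptions; the full method plans over histories.

# Sketch of root sampling for Bayes-adaptive planning via Monte-Carlo tree search.
import math
import random
from collections import defaultdict

class Posterior:
    """Toy Dirichlet-multinomial belief over transition probabilities."""
    def __init__(self, n_states, n_actions, prior=1.0):
        self.n_states, self.n_actions = n_states, n_actions
        self.counts = defaultdict(lambda: [prior] * n_states)

    def update(self, s, a, s_next):
        # Bayes' rule is applied only to real experience, never during tree search.
        self.counts[(s, a)][s_next] += 1.0

    def sample_mdp(self):
        # Draw one concrete transition model from the Dirichlet posterior.
        model = {}
        for s in range(self.n_states):
            for a in range(self.n_actions):
                g = [random.gammavariate(c, 1.0) for c in self.counts[(s, a)]]
                total = sum(g)
                model[(s, a)] = [x / total for x in g]
        return model

def uct_plan(root, posterior, reward_fn, n_actions, n_sims=500, horizon=15,
             gamma=0.95, c_uct=1.4):
    """Sample-based search: one posterior sample per simulation (root sampling)."""
    N = defaultdict(int)      # node visit counts, keyed on (state, depth)
    Na = defaultdict(int)     # action visit counts
    Q = defaultdict(float)    # action-value estimates

    def simulate(s, depth, model):
        if depth >= horizon:
            return 0.0
        def ucb(a):           # UCB1 action selection; untried actions go first
            if Na[(s, depth, a)] == 0:
                return float("inf")
            return Q[(s, depth, a)] + c_uct * math.sqrt(
                math.log(N[(s, depth)] + 1) / Na[(s, depth, a)])
        a = max(range(n_actions), key=ucb)
        probs = model[(s, a)]                       # step within the *sampled* model
        s_next = random.choices(range(len(probs)), weights=probs)[0]
        ret = reward_fn(s, a, s_next) + gamma * simulate(s_next, depth + 1, model)
        N[(s, depth)] += 1                          # back up the discounted return
        Na[(s, depth, a)] += 1
        Q[(s, depth, a)] += (ret - Q[(s, depth, a)]) / Na[(s, depth, a)]
        return ret

    for _ in range(n_sims):
        simulate(root, 0, posterior.sample_mdp())   # one sampled MDP per simulation
    return max(range(n_actions), key=lambda a: Q[(root, 0, a)])

# Hypothetical usage on a toy 3-state, 2-action problem:
#   belief = Posterior(n_states=3, n_actions=2)
#   action = uct_plan(root=0, posterior=belief, n_actions=2,
#                     reward_fn=lambda s, a, s2: float(s2 == 2))

Root sampling keeps each simulation as cheap as planning in a single known MDP; the belief is only updated with real transitions (Posterior.update), which is what the description means by avoiding expensive applications of Bayes' rule within the tree.
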
Keywords: Bayesian models; reinforcement learning; optimal planning
Source: 视频讲座网
Last reviewed: 2019-09-09 (cjy)
Views: 38