Agnostic KWIK learning and efficient approximate reinforcement learning - MOOC overseas open courses


Agnostic KWIK learning and efficient approximate reinforcement learning


Course URL:	http://videolectures.net/colt2011_szepesvari_agnostic/
Lecturer:	Csaba Szepesvári
Institution:	University of Alberta
Date:	2011-08-02
Language:	English
Abstract:	A popular approach in reinforcement learning is to use a model-based algorithm, i.e., an algorithm that uses a model learner to learn an approximate model of the environment. It has been shown that such a model-based learner is efficient if the model learner is efficient in the so-called "knows what it knows" (KWIK) framework. A major limitation of the standard KWIK framework is that, by its very definition, it covers only the case in which the (model) learner can represent the actual environment with no errors. In this paper, we introduce the agnostic KWIK learning model, which relaxes this assumption by allowing nonzero approximation errors. We show that, under the new definition, an efficient model learner still leads to an efficient reinforcement learning algorithm. At the same time, we find that learning within the new framework can be substantially slower than in the standard framework, even for simple learning problems.
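Keywords:	reinforcement learning; model learning; KWIK framework; learning algorithms; learning environments
Source:	VideoLectures.NET open courses
Last reviewed:	2019-05-26 (cwx)
View count:	97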
