Deconstructing Reinforcement Learning |
Course URL: | http://videolectures.net/icml09_sutton_itdrl/ |
Instructor: | Richard S. Sutton |
Institution: | University of Alberta |
Date: | 2009-08-26 |
Language: | English |
Course description: | The premise of this symposium is that the ideas of reinforcement learning have impacted many fields, including artificial intelligence, neuroscience, control theory, psychology, and economics. But what are these ideas, and which of them is key? Is it the idea of reward and reward prediction as a way of structuring the problem facing both natural and artificial systems? Is it temporal-difference learning as a sample-based algorithm for approximating dynamic programming? Or is it the idea of learning online, by trial and error, searching to find a way of behaving that might not be known by any human supervisor? Or is it all of these ideas and others, all coming to renewed prominence and significance as these fields focus on the common problem that faces animals, machines, and societies - how to predict and control a hugely complex world that can never be understood completely, but only as a gross, ever-changing approximation? In this talk I seek to start the process of phrasing and answering these questions. In some cases, from my own experience, I can identify which ideas have been the most important, and guess which will be most important in the future. For others, I can only ask the other speakers and attendees to provide informed perspectives from their own fields. |
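Keywords: | reinforcement learning; approximate dynamic programming; temporal-difference learning |
Course source: | VideoLectures.NET |
Data collected: | 2022-11-22:chenjy |
Last reviewed: | 2022-11-22:chenjy |
Views: | 38 |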