Hoeffding and Bernstein Races for Selecting Policies in Evolutionary Direct Policy Search




Course URL:	https://videolectures.net/videos/icml09_igel_hbrs
Lecturer:	Christian Igel
Offered by:	Conference
Date:	2009-08-26
Language:	English
Course description:	Uncertainty arises in reinforcement learning from various sources, so behavioral policies must be evaluated using statistics based on several roll-outs. We add adaptive uncertainty handling based on Hoeffding and empirical Bernstein races to the CMA-ES, a variable metric evolution strategy proposed for direct policy search. The uncertainty handling individually adjusts the number of episodes considered for the evaluation of each policy. The performance estimate is kept just accurate enough for a sufficiently good ranking of candidate policies, which in turn is sufficient for the CMA-ES to find better solutions. This increases both the learning speed and the robustness of the algorithm.
Keywords:	direct policy search; policy selection; reinforcement learning
Source:	VideoLectures.NET
Data collected:	2025-04-25, liyq
Last reviewed:	2025-04-25, liyq
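
The course description refers to Hoeffding and empirical Bernstein races without showing what a race looks like. The following is a minimal Python sketch of the general idea, not the lecture's algorithm: each candidate policy receives more episodes only while its confidence interval still overlaps another candidate's. The function names, the fixed per-test `delta`, the stopping rule, and the toy policies are all assumptions made for illustration. A full race would also split the confidence level across rounds and candidates, and the coupling to CMA-ES is omitted here.

```python
import math
import random
from statistics import fmean, pstdev


def hoeffding_radius(n, value_range, delta):
    """Hoeffding half-width for the mean of n samples bounded in a range of
    width value_range, valid with probability at least 1 - delta."""
    return value_range * math.sqrt(math.log(2.0 / delta) / (2.0 * n))


def empirical_bernstein_radius(samples, value_range, delta):
    """Empirical Bernstein half-width; uses the (1/n) empirical standard deviation."""
    n = len(samples)
    log_term = math.log(3.0 / delta)
    return (pstdev(samples) * math.sqrt(2.0 * log_term / n)
            + 3.0 * value_range * log_term / n)


def race(policies, value_range, delta, max_episodes, use_bernstein=True):
    """Sample episodes only for policies whose interval still overlaps another one.

    policies: list of zero-argument callables, each returning one episode's
    return (hypothetical stand-ins for real roll-outs).
    Returns (mean return per policy, number of episodes used per policy).
    """
    returns = [[] for _ in policies]
    active = set(range(len(policies)))
    for _ in range(max_episodes):
        for i in active:
            returns[i].append(policies[i]())
        bounds = []
        for r in returns:
            if use_bernstein:
                radius = empirical_bernstein_radius(r, value_range, delta)
            else:
                radius = hoeffding_radius(len(r), value_range, delta)
            mean = fmean(r)
            bounds.append((mean - radius, mean + radius))
        # A policy stays in the race while its interval overlaps any other interval.
        active = {
            i for i in range(len(policies))
            if any(j != i
                   and bounds[i][0] <= bounds[j][1]
                   and bounds[j][0] <= bounds[i][1]
                   for j in range(len(policies)))
        }
        if not active:
            break
    return [fmean(r) for r in returns], [len(r) for r in returns]


if __name__ == "__main__":
    rng = random.Random(0)

    def noisy_policy(mean):
        # Toy episode return, clipped to [0, 1] so that value_range = 1.0 holds.
        return lambda: min(1.0, max(0.0, rng.gauss(mean, 0.1)))

    means, counts = race([noisy_policy(0.3), noisy_policy(0.5), noisy_policy(0.8)],
                         value_range=1.0, delta=0.05, max_episodes=2000)
    print("estimated returns:", means)
    print("episodes used:", counts)
```

In this sketch a clearly better or clearly worse policy stops consuming episodes early, while candidates with similar returns keep being sampled. The empirical Bernstein radius shrinks faster than the Hoeffding radius when the observed variance is small relative to the value range and enough samples are available; for very few samples its `1/n` term can make it the wider of the two.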
