Model-Free Reinforcement Learning as Mixture Learning (MOOC overseas open course)




Course URL:	http://videolectures.net/icml09_vlassis_mfr/
Instructor:	Nikos Vlassis
Institution:	Technical University of Crete
Date:	2009-08-26
Language:	English
Description:	We cast model-free reinforcement learning as the problem of maximizing the likelihood of a probabilistic mixture model via sampling, addressing both the infinite- and finite-horizon cases. We describe a Stochastic Approximation EM algorithm for likelihood maximization that, in the tabular case, is equivalent to a non-bootstrapping optimistic policy iteration algorithm like Sarsa(1), and that can be applied in both MDPs and POMDPs. On the theoretical side, by relating the proposed stochastic EM algorithm to the family of optimistic policy iteration algorithms, we provide new tools that permit the design and analysis of algorithms in that family. On the practical side, preliminary experiments on a POMDP problem gave encouraging results.
Keywords:	model-free reinforcement learning; sampling; likelihood maximization
Source:	VideoLectures.NET
Last reviewed:	2021-04-09 (yumf)
Views:	139
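
The description mentions non-bootstrapping optimistic policy iteration in the tabular case. The sketch below illustrates that family only, as plain tabular Monte Carlo control that updates each visited state-action value toward the full sampled return after every episode. It is not the Stochastic Approximation EM algorithm from the lecture. The chain environment, the function names, and all parameter values are hypothetical choices made for illustration.

```python
import random

# Hypothetical toy chain: states 0..3, state 3 is the terminal goal.
N_STATES = 4
ACTIONS = (0, 1)  # 0 = left, 1 = right
GAMMA = 0.9       # assumed discount factor


def step(state, action):
    """Deterministic toy dynamics; reward 1 only when the goal is entered."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def epsilon_greedy(q, state, eps, rng):
    """Pick a random action with probability eps, else the greedy one."""
    if rng.random() < eps:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[state][a])


def mc_control(episodes=2000, eps=0.2, alpha=0.05, horizon=50, seed=0):
    """Every-visit Monte Carlo control with a constant step size."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, trajectory = 0, []
        for _ in range(horizon):
            action = epsilon_greedy(q, state, eps, rng)
            nxt, reward, done = step(state, action)
            trajectory.append((state, action, reward))
            state = nxt
            if done:
                break
        # Non-bootstrapping update: the target is the full discounted
        # return observed from each step, not a one-step estimate.
        g = 0.0
        for s, a, r in reversed(trajectory):
            g = r + GAMMA * g
            q[s][a] += alpha * (g - q[s][a])
    return q


def greedy_policy(q):
    """Greedy action for each non-terminal state."""
    return [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

Calling `greedy_policy(mc_control())` returns the greedy action for each non-terminal state of the toy chain. The policy is improved after every single episode instead of after a full evaluation phase, which is the sense in which the iteration is "optimistic".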
