Near Optimal Chernoff Bounds for Markov Decision Processes (MOOC overseas open course)


Course URL:	http://videolectures.net/machine_mihai_moldovan_processes/
Lecturer:	Teodor Mihai Moldovan
Institution:	University of California, Berkeley
Start date:	2013-01-14
Language:	English
Course description:	The expected return is a widely used objective in decision making under uncertainty. Many algorithms, such as value iteration, have been proposed to optimize it. In risk-aware settings, however, the expected return is often not an appropriate objective to optimize. We propose a new optimization objective for risk-aware planning and show that it has desirable theoretical properties. We also draw connections to previously proposed objectives for risk-aware planning: minmax, exponential utility, percentile, and mean minus variance. Our method applies to an extended class of Markov decision processes: we allow costs to be stochastic as long as they are bounded. Additionally, we present an efficient algorithm for optimizing the proposed objective. Synthetic and real-world experiments illustrate the effectiveness of our method at scale.
Keywords:	expected return; risk-aware planning; Markov decision processes
Source:	VideoLectures.NET
Last reviewed:	2019-05-15 (lxf)
Views:	77