Near Optimal Chernoff Bounds for Markov Decision Processes |
|
Course URL: | http://videolectures.net/machine_mihai_moldovan_processes/ |
Lecturer: | Teodor Mihai Moldovan |
Institution: | University of California, Berkeley |
Date: | 2013-01-14 |
Language: | English |
Abstract: | The expected return is a widely used objective in decision making under uncertainty. Many algorithms, such as value iteration, have been proposed to optimize it. In risk-aware settings, however, the expected return is often not an appropriate objective to optimize. We propose a new optimization objective for risk-aware planning and show that it has desirable theoretical properties. We also draw connections to previously proposed objectives for risk-aware planning: minmax, exponential utility, percentile, and mean minus variance. Our method applies to an extended class of Markov decision processes: we allow costs to be stochastic as long as they are bounded. Additionally, we present an efficient algorithm for optimizing the proposed objective. Synthetic and real-world experiments illustrate the effectiveness of our method at scale. |
Keywords: | expected return; risk-aware planning; Markov decision processes |
Source: | VideoLectures.NET |
Last reviewed: | 2019-05-15:lxf |
Views: | 57 |