Considering Unseen States as Impossible in Factored Reinforcement Learning |
|
Course URL: | http://videolectures.net/ecmlpkdd09_kozlova_cusifr/ |
Lecturer: | Olga Kozlova |
Institution: | Pierre and Marie Curie University |
Date: | 2009-10-20 |
Language: | English |
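Chinese abstract (translated): | The Factored Markov Decision Process (FMDP) framework is a standard representation for sequential decision problems under uncertainty, in which the state is represented as a set of random variables. Factored Reinforcement Learning (FRL) is a model-based reinforcement learning approach to FMDPs in which the transition and reward functions of the problem are learned. In this paper, we show how to model, in a theoretically well-founded way, problems in which some combinations of state variable values may not occur, giving rise to impossible states. In addition, we propose a new heuristic that treats states not yet seen as impossible. We derive an algorithm and illustrate, through benchmark experiments, its improvement in performance over the standard approach. |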
Course description: | The Factored Markov Decision Process (FMDP) framework is a standard representation for sequential decision problems under uncertainty where the state is represented as a collection of random variables. Factored Reinforcement Learning (FRL) is a model-based reinforcement learning approach to FMDPs where the transition and reward functions of the problem are learned. In this paper, we show how to model, in a theoretically well-founded way, problems where some combinations of state variable values may not occur, giving rise to impossible states. Furthermore, we propose a new heuristic that considers as impossible the states that have not been seen so far. We derive an algorithm whose improvement in performance with respect to the standard approach is illustrated through benchmark experiments. |
Keywords: | computer science; Markov decision processes; reinforcement learning |
Source: | VideoLectures.NET |
Last reviewed: | 2019-12-04:lxf |
Views: | 70 |
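
The heuristic named in the description (treating states not seen so far as impossible) can be illustrated with a small sketch. This is not the algorithm from the lecture: the function names `joint_next_state_probs` and `mask_unseen`, the two-variable toy problem, and the choice to renormalize after masking are all assumptions made for illustration. The sketch only shows why a factored model, which multiplies per-variable distributions, can assign probability to joint states that never occur, and how a set of seen states could be used to filter them out.

```python
from itertools import product


def joint_next_state_probs(per_variable_probs):
    """Multiply independent per-variable distributions into a joint one.

    per_variable_probs maps a variable name to {value: probability}.
    Returns {tuple_of_values: probability}, ordered like the input dict.
    """
    variables = list(per_variable_probs)
    joint = {}
    for values in product(*(per_variable_probs[v] for v in variables)):
        p = 1.0
        for var, val in zip(variables, values):
            p *= per_variable_probs[var][val]
        joint[values] = p
    return joint


def mask_unseen(joint, seen_states):
    """Drop joint states never observed, then renormalize (an assumption)."""
    masked = {s: p for s, p in joint.items() if s in seen_states}
    total = sum(masked.values())
    if total == 0.0:
        # No seen state has any mass: fall back to the unmasked model.
        return dict(joint)
    return {s: p / total for s, p in masked.items()}


# Hypothetical toy problem: two binary variables that are always equal,
# so (0, 1) and (1, 0) are impossible states.
per_variable = {"x": {0: 0.5, 1: 0.5}, "y": {0: 0.5, 1: 0.5}}
seen = {(0, 0), (1, 1)}

joint = joint_next_state_probs(per_variable)
masked = mask_unseen(joint, seen)
```

Here the factored product spreads probability evenly over four joint states, while the masked distribution keeps only the two observed ones. A real FRL system would learn the per-variable distributions and the set of seen states from experience rather than hard-coding them.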