Automatic Discovery and Transfer of MAXQ Hierarchies |
|
Course URL: | http://videolectures.net/icml08_mehta_adt/ |
Lecturer: | Neville Mehta |
Institution: | Oregon State University |
Date: | Not available. |
Language: | English |
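Summary (translated from Chinese): | We propose an algorithm, HI-MAT (Hierarchy Induction via Models And Trajectories), that discovers MAXQ task hierarchies by applying dynamic Bayesian network models to successful trajectories from a source reinforcement learning task. HI-MAT discovers subtasks by analyzing the causal and temporal relationships among the actions in a trajectory. Under appropriate assumptions, HI-MAT induces hierarchies that are consistent with the observed trajectories and that have compact value-function tables using safe state abstractions. We show empirically that the compact hierarchies HI-MAT constructs are comparable to hand-designed hierarchies and yield a significant speedup in learning when transferred to a target task. |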
Course description: | We present an algorithm, HI-MAT (Hierarchy Induction via Models And Trajectories), that discovers MAXQ task hierarchies by applying dynamic Bayesian network models to a successful trajectory from a source reinforcement learning task. HI-MAT discovers subtasks by analyzing the causal and temporal relationships among the actions in the trajectory. Under appropriate assumptions, HI-MAT induces hierarchies that are consistent with the observed trajectory and have compact value-function tables employing safe state abstractions. We demonstrate empirically that HI-MAT constructs compact hierarchies that are comparable to manually engineered hierarchies and facilitate significant speedup in learning when transferred to a target task. |
Keywords: | Computer science; reinforcement learning; Bayesian network models |
Source: | VideoLectures.NET |
Last reviewed: | 2019-11-16:cwx |
Views: | 38 |
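
The description says HI-MAT analyzes causal and temporal relationships among the actions in a trajectory. The following is a minimal sketch of that idea, not the HI-MAT implementation: it links each action to the most recent earlier action that wrote a variable it reads. The function name `causal_edges`, the toy actions, and the hand-written `reads`/`writes` sets are all assumptions made for illustration; in the lecture's setting that information would come from dynamic Bayesian network models.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical per-action models: which state variables an action reads
# and which it writes. Hand-written here; assumed to stand in for DBN models.
ActionModels = Dict[str, Dict[str, Set[str]]]


def causal_edges(trajectory: List[str],
                 models: ActionModels) -> List[Tuple[int, int, str]]:
    """Return (i, j, var) triples: step i was the last writer of var before step j read it."""
    last_writer: Dict[str, int] = {}
    edges: List[Tuple[int, int, str]] = []
    for j, action in enumerate(trajectory):
        # Link this step to the latest earlier step that wrote each variable it reads.
        for var in sorted(models[action]["reads"]):
            if var in last_writer:
                edges.append((last_writer[var], j, var))
        # Record this step as the latest writer of everything it changes.
        for var in models[action]["writes"]:
            last_writer[var] = j
    return edges


# Toy resource-gathering trajectory (illustrative only).
models: ActionModels = {
    "goto_forest": {"reads": {"location"}, "writes": {"location"}},
    "chop": {"reads": {"location", "has_wood"}, "writes": {"has_wood"}},
    "goto_base": {"reads": {"location"}, "writes": {"location"}},
    "deposit": {"reads": {"location", "has_wood"},
                "writes": {"has_wood", "wood_stored"}},
}
trajectory = ["goto_forest", "chop", "goto_base", "deposit"]

for i, j, var in causal_edges(trajectory, models):
    print(f"{trajectory[i]} (step {i}) -> {trajectory[j]} (step {j}) via {var}")
```

In this toy run, `chop` and `deposit` are tied together through `has_wood`, while the two navigation actions are tied to their successors through `location`. Grouping actions by the variables that connect them is the kind of signal a hierarchy-induction method could use to propose subtasks; how HI-MAT actually partitions the trajectory is not specified on this page.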