
Influence-based Policy Abstraction for Weakly-coupled DEC-POMDPs
Course URL: http://videolectures.net/icaps2010_witwicki_ibpa/  
Lecturer: Stefan Witwicki
Institution: University of Michigan
Date: 2010-11-15
Language: English
Course description: Decentralized POMDPs are powerful theoretical models for coordinating agents’ decisions in uncertain environments, but the generally intractable complexity of optimal joint policy construction presents a significant obstacle in applying Dec-POMDPs to problems where many agents face many policy choices. Here, we argue that when most agent choices are independent of other agents’ choices, much of this complexity can be avoided: instead of coordinating full policies, agents need only coordinate policy abstractions that explicitly convey the essential interaction influences. To this end, we develop a novel framework for influence-based policy abstraction for weakly-coupled transition-dependent Dec-POMDP problems that subsumes several existing approaches. In addition to formally characterizing the space of transition-dependent influences, we provide a method for computing optimal and approximately optimal joint policies. We present an initial empirical analysis, over problems with commonly studied flavors of transition-dependent influences, that demonstrates the potential computational benefits of influence-based abstraction over state-of-the-art optimal policy search methods.
Keywords: policy coordination; joint policies; empirical analysis; optimal policy search methods
Source: VideoLectures.NET
Last reviewed: 2020-09-28:heyf