0


重复随机博弈的多项式时间纳什均衡算法

A Polynomial-time Nash Equilibrium Algorithm for Repeated Stochastic Games
课程网址: http://videolectures.net/uai08_munoz_aptnea/  
主讲教师: Enrique Munoz de Cote
开课单位: 米兰理工学院
开课时间: 2008-07-30
课程语种: 英语
中文简介:
我们提出了一个多项式时间算法,它总是为重复的两人随机博弈找到一个(近似的)纳什均衡。该算法利用福克定理,在可能的情况下,通过支持互惠行为和威胁,得出一个策略配置文件,从而形成一个平衡。我们算法的一个组成部分有效地搜索平等点的近似值,即最公平的帕累托有效解。最后将该算法应用到一组网格博弈中,说明了该算法的典型解。这些解决方案与竞争算法所发现的方案相比非常有利,从而产生具有更高社会福利和保证计算效率的策略。
课程简介: We present a polynomial-time algorithm that always finds an (approximate) Nash equilibrium for repeated two-player stochastic games. The algorithm exploits the folk theorem to derive a strategy profile that forms an equilibrium by buttressing mutually beneficial behavior with threats, where possible. One component of our algorithm efficiently searches for an approximation of the egalitarian point, the fairest pareto-efficient solution. The paper concludes by applying the algorithm to a set of grid games to illustrate typical solutions the algorithm finds. These solutions compare very favorably to those found by competing algorithms, resulting in strategies with higher social welfare, as well as guaranteed computational efficiency.
关 键 词: 数学; 博弈论; 人工智能 ; 计算机科学; 机器学习; 马尔可夫过程
课程来源: 视频讲座网
最后编审: 2020-09-28:heyf
阅读次数: 88