Evaluating Deterministic Policies in Two-player Iterated Games
Course URL: http://videolectures.net/eccs07_dilao_edp/
Lecturer: Rui Dilão
Institution: University of Lisbon
Date: 2007-11-29
Language: English
Abstract: We construct a statistical ensemble of games, where in each independent subensemble two players play the same game. We derive the mean payoffs per move of the representative players of the game, and we evaluate all the deterministic policies with finite memory. In particular, we show that if one of the players has a generalized tit-for-tat policy, the mean payoffs per move of the two players are forced to be equal. In the case of symmetric, non-cooperative and dilemmatic games, we show that generalized tit-for-tat or imitation policies, together with the condition of not being the first to defect, lead to the highest mean payoffs per move for the players. Within this approach, it can be decided which policies perform better than others. The Prisoner's Dilemma and the Hawk-Dove games have been analyzed, and the equilibrium states of the infinitely iterated games have been determined. The infinitely iterated Prisoner's Dilemma game can have Nash solutions only if players have deterministic policies.
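
The equalization claim for tit-for-tat can be checked numerically with a minimal sketch. It is not the lecture's statistical-ensemble derivation. It assumes the common Prisoner's Dilemma payoffs T=5, R=3, P=1, S=0, plain (not generalized) tit-for-tat, and memory-one opponents only. The names `mean_payoffs`, `make_memory_one` and `max_gap_against_tft` are hypothetical helpers introduced for illustration.

```python
from itertools import product

# Assumed Prisoner's Dilemma payoffs (T > R > P > S); not taken from the lecture.
T, R, P, S = 5, 3, 1, 0
PAYOFF = {("C", "C"): (R, R), ("C", "D"): (S, T),
          ("D", "C"): (T, S), ("D", "D"): (P, P)}


def tit_for_tat(own_last, opp_last):
    """Cooperate on the first move, then copy the opponent's previous move."""
    return "C" if opp_last is None else opp_last


def make_memory_one(first, table):
    """Build a deterministic memory-one policy from a first move and a
    lookup table keyed by (own last move, opponent's last move)."""
    def policy(own_last, opp_last):
        return first if opp_last is None else table[(own_last, opp_last)]
    return policy


def mean_payoffs(policy_a, policy_b, n_moves=10_000):
    """Play n_moves rounds and return each player's mean payoff per move."""
    last_a = last_b = None
    total_a = total_b = 0
    for _ in range(n_moves):
        a = policy_a(last_a, last_b)
        b = policy_b(last_b, last_a)
        pay_a, pay_b = PAYOFF[(a, b)]
        total_a += pay_a
        total_b += pay_b
        last_a, last_b = a, b
    return total_a / n_moves, total_b / n_moves


def max_gap_against_tft(n_moves=10_000):
    """Largest gap in mean payoff per move between tit-for-tat and any of
    the 32 deterministic memory-one opponents (2 first moves x 2^4 tables)."""
    states = list(product("CD", repeat=2))
    worst = 0.0
    for first in "CD":
        for responses in product("CD", repeat=4):
            opponent = make_memory_one(first, dict(zip(states, responses)))
            mean_tft, mean_opp = mean_payoffs(tit_for_tat, opponent, n_moves)
            worst = max(worst, abs(mean_tft - mean_opp))
    return worst


if __name__ == "__main__":
    print(max_gap_against_tft())
```

Because tit-for-tat replays the opponent's moves with a delay of one round, the total payoff difference telescopes to at most T - S, so the gap in mean payoff per move is bounded by (T - S) / n_moves and vanishes as the number of moves grows.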
Keywords: games; infinitely iterated games; policies
Source: VideoLectures.NET
Data collected: 2021-06-08:nkq
Last reviewed: 2021-06-08:nkq
Views: 49