
Toward the understanding of partial-monitoring games
Course URL: http://videolectures.net/explorationexploitation2011_szepesvari_t...
Lecturer: Csaba Szepesvári
Institution: University of Alberta
Date: 2011-07-25
Language: English
Course description: Partial monitoring games form a common ground for problems such as learning with expert advice, the multi-armed bandit problem, dynamic pricing, the dark pool problem, label-efficient prediction, channel allocation in cognitive radio, and linear and convex optimization with various kinds of feedback. How hard it is to learn to play well in a partial monitoring game can be characterized by the minimax growth rate of the learner’s regret. It is well known that some of these games are harder to learn and others easier, depending on how much information one receives and at what cost: some actions in a game might give the learner more information but be pricier, while others might give less information (or none) but be cheaper. When designing a strategy, one should clearly take into account the global information-cost structure of the game. The question asked in this talk is how. What are the good learning strategies? That is, given the structure of a game, what is the corresponding minimax regret, and what algorithm achieves it? In this talk I will review recent progress toward answering this question, as well as some open problems.
Keywords: partial monitoring games; multi-armed bandit problem; channel allocation
Source: VideoLectures.NET
