A Finite-Time Analysis of Multi-armed Bandits Problems with Kullback-Leibler Divergences
Course URL: http://videolectures.net/colt2011_maillard_analysis/
Lecturer: Odalric-Ambrym Maillard
Institution: INRIA
Date: 2011-08-02
Language: English
Course description: We consider a Kullback-Leibler-based algorithm for the stochastic multi-armed bandit problem in the case of distributions with finite supports (not necessarily known beforehand), whose asymptotic regret matches the lower bound of Burnetas and Katehakis (1996). Our contribution is to provide a finite-time analysis of this algorithm; we get bounds whose main terms are smaller than the ones of previously known algorithms with finite-time analyses (like UCB-type algorithms).
Keywords: optimization methods; novel algorithms; finite-time analysis; algorithmic computation
Source: VideoLectures.NET
Last reviewed: 2020-07-29 (yumf)
Views: 101