

Improved Regret Guarantees for Online Smooth Convex Optimization with Bandit Feedback
Course URL: http://videolectures.net/aistats2011_saha_guarantees/
Lecturer: Ankan Saha
Institution: University of Chicago
Course date: Not available.
Language: English
Description: The study of online convex optimization in the bandit setting was initiated by Kleinberg (2004) and Flaxman et al. (2005). Such a setting models a decision maker who must make decisions in the face of adversarially chosen convex loss functions. Moreover, the only information the decision maker receives is the incurred loss; the identities of the loss functions themselves are not revealed. In this setting, we reduce the gap between the best known lower and upper bounds for the class of smooth convex functions, i.e. convex functions with a Lipschitz continuous gradient. Building upon existing work on self-concordant regularizers and one-point gradient estimation, we give the first algorithm whose expected regret is O(T^{2/3}), ignoring constant and logarithmic factors.
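
The one-point gradient estimation mentioned in the description refers to the standard single-point estimator of Flaxman et al. (2005). The following is a minimal illustrative Python sketch of that estimator inside a generic bandit gradient-descent loop; it is not the lecture's algorithm, and the function names, step size eta, and perturbation radius delta are assumptions made for the example.

```python
import numpy as np

def one_point_gradient_estimate(loss_value, u, delta, d):
    """Single-point gradient estimate (Flaxman et al., 2005):
    (d / delta) * f(x + delta * u) * u is, in expectation over u uniform
    on the unit sphere, the gradient of a delta-smoothed version of f at x."""
    return (d / delta) * loss_value * u

def bandit_gradient_descent(play_and_observe_loss, x0, T, eta=0.01, delta=0.1):
    """Generic bandit online gradient descent loop (illustrative sketch).
    `play_and_observe_loss(y)` returns only the scalar loss of the played point y."""
    d = len(x0)
    x = np.array(x0, dtype=float)
    for t in range(T):
        u = np.random.randn(d)
        u /= np.linalg.norm(u)           # random direction on the unit sphere
        y = x + delta * u                # perturbed point that is actually played
        loss = play_and_observe_loss(y)  # bandit feedback: only the loss value
        g_hat = one_point_gradient_estimate(loss, u, delta, d)
        x = x - eta * g_hat              # descent step using the gradient estimate
        # (in general, a projection back onto the feasible set is also required)
    return x
```

The lecture's contribution, as stated above, is to combine this kind of one-point estimate with self-concordant regularizers to obtain an expected regret of O(T^{2/3}) for smooth convex losses; the sketch only shows the estimator itself.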
Keywords: bandit setting; online convex optimization; gradient estimation
Source: VideoLectures.NET
Last reviewed: 2019-11-16 (cwx)
Views: 21