
Efficient Bandit Algorithms for Online Multiclass Prediction
Course URL: http://videolectures.net/icml08_shwartz_eba/
Lecturer: Shai Shalev-Shwartz
Institution: The Hebrew University of Jerusalem
Date: 2008-08-04
Language: English
Course description: This paper introduces the Banditron, a variant of the Perceptron for the multiclass bandit setting. This setting models a wide range of practical supervised learning applications in which the learner receives only partial feedback about the true label (called "bandit" feedback, in the spirit of multi-armed bandit models). In many web applications, for example, users provide only positive "click" feedback, which does not necessarily disclose the true label. The Banditron learns multiclass classification from bandit feedback that reveals only whether the algorithm's prediction was correct, and not necessarily what the true label is. We provide (relative) mistake bounds that show the Banditron performs favorably, and our experiments demonstrate the practicality of the algorithm. The paper also pays close attention to the important special case of linearly separable data, a problem that has been studied exhaustively in the full-information setting yet is novel in the bandit setting.
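The description above does not spell out the update rule, so the following is only a minimal sketch of one Banditron round as the algorithm is usually stated: predict greedily, explore uniformly with probability `gamma`, and apply an importance-weighted Perceptron-style update using only the correct/incorrect bit. The names `banditron_step`, `gamma`, and `rng` are illustrative choices, not taken from the lecture page.

```python
import random


def banditron_step(W, x, true_label, gamma, rng):
    """Run one Banditron round and update W in place.

    W          : list of k weight rows (one per class), each of length len(x)
    x          : feature vector (list of floats)
    true_label : used ONLY to compute the one-bit bandit feedback
    gamma      : exploration rate (assumed parameter name)
    rng        : a random.Random instance
    Returns (label shown to the environment, whether it was correct).
    """
    k = len(W)
    # Greedy prediction of the current linear model.
    scores = [sum(w * v for w, v in zip(row, x)) for row in W]
    y_hat = max(range(k), key=lambda r: scores[r])
    # Mix the greedy choice with uniform exploration.
    probs = [gamma / k + ((1.0 - gamma) if r == y_hat else 0.0)
             for r in range(k)]
    y_tilde = rng.choices(range(k), weights=probs)[0]
    # Bandit feedback: the learner only learns whether y_tilde was right.
    correct = (y_tilde == true_label)
    # Importance-weighted update: always demote the greedy label, and
    # promote the sampled label (scaled by 1 / its probability) if correct.
    for j, v in enumerate(x):
        W[y_hat][j] -= v
        if correct:
            W[y_tilde][j] += v / probs[y_tilde]
    return y_tilde, correct
```

A caller would keep `W` across rounds (for example starting from all zeros) and feed each example through `banditron_step`. When the greedy label is sampled and confirmed correct, the two terms of the update cancel and `W` is left unchanged, which mirrors the Perceptron's behavior of updating only on mistakes in expectation.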
Keywords: multiclass bandit setting; favorable performance; practicality
Course source: VideoLectures.NET
Last reviewed: 2020-05-21: 王淑红 (volunteer course editor)
Views: 126