Tradeoffs in online learning under partial information feedback
Course URL: http://videolectures.net/nipsworkshops2012_szepesvari_feedback/
Lecturer: Csaba Szepesvári
Institution: University of Alberta
Date: 2013-01-16
Language: English
Abstract: How should an online learner choose its actions to trade off exploration against exploitation and maximize the accuracy of its predictions, when the choice of actions directly influences what information the learner receives? First, using the abstract framework of partial monitoring, we provide a full answer to this question for any discrete prediction problem: as it turns out, the difficulty of the optimal tradeoff depends on a novel, yet intuitive, geometric-algebraic condition. We also discuss tradeoffs and open problems concerning adaptation to benign environments, prediction with side information, a specific problem in which the learner needs to pay for accessing the feature values and the label, and the influence of delays in receiving the feedback.
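Keywords: online learning; abstract framework; partial monitoring; discrete prediction; tradeoffs; geometric-algebraic condition
Source: VideoLectures.NET
Last reviewed: 2020-06-02, 张荧 (volunteer course editor)
Views: 38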