

Multiclass Learnability and the ERM principle
Course URL: http://videolectures.net/colt2011_daniely_principle/
Lecturer: Amit Daniely
Institution: The Hebrew University
Date: 2011-08-02
Language: English
Abstract: Multiclass learning is an area of growing practical relevance, for which the currently available theory is still far from providing satisfactory understanding. We study the learnability of multiclass prediction, and derive upper and lower bounds on the sample complexity of multiclass hypothesis classes in different learning models: batch/online, realizable/unrealizable, full information/bandit feedback. Our analysis reveals a surprising phenomenon: in the multiclass setting, in sharp contrast to binary classification, not all Empirical Risk Minimization (ERM) algorithms are equally successful. We show that there exist hypothesis classes for which some ERM learners have lower sample complexity than others. Furthermore, there are classes that are learnable by some ERM learners, while other ERM learners will fail to learn them. We propose a principle for designing good ERM learners, and use this principle to prove tight bounds on the sample complexity of learning symmetric multiclass hypothesis classes (that is, classes that are invariant under any permutation of label names). We demonstrate the relevance of the theory by analyzing the sample complexity of two widely used hypothesis classes: generalized linear multiclass models and reduction trees. We also obtain some practically relevant conclusions.
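The abstract's central observation is that any rule which returns *some* empirical-risk minimizer counts as an ERM learner, so different tie-breaking rules define different ERM learners, and in the multiclass setting these can behave differently. A minimal sketch of this point, using a toy hypothesis class and illustrative names not taken from the lecture:

```python
# Toy multiclass hypothesis class over inputs {0, 1, 2} with labels {0, 1, 2}.
# All hypotheses and data here are hypothetical, chosen only to illustrate ties.
H = {
    "h_mod":   lambda x: x % 3,                 # identity on {0, 1, 2}
    "h_const": lambda x: 0,                     # always predicts label 0
    "h_swap":  lambda x: {0: 0, 1: 2, 2: 1}[x], # swaps labels 1 and 2
}

def empirical_risk(h, sample):
    """Fraction of labeled examples (x, y) that hypothesis h misclassifies."""
    return sum(h(x) != y for x, y in sample) / len(sample)

def erm(H, sample, tie_break=min):
    """Return the name of an empirical-risk minimizer in H.

    Any choice among the minimizers is a valid ERM learner; the
    tie_break argument makes the (otherwise arbitrary) choice explicit.
    """
    best = min(empirical_risk(h, sample) for h in H.values())
    minimizers = sorted(name for name, h in H.items()
                        if empirical_risk(h, sample) == best)
    return tie_break(minimizers)

# A single example consistent with all three hypotheses: every hypothesis
# attains empirical risk 0, so the two "ERM learners" below disagree.
sample = [(0, 0)]
print(erm(H, sample, tie_break=min))  # prints "h_const"
print(erm(H, sample, tie_break=max))  # prints "h_swap"
```

The point of the sketch is only that ERM underdetermines the output hypothesis; which minimizer is chosen is exactly what the lecture's analysis shows can matter for multiclass sample complexity.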
Keywords: multiclass learning; learning models; binary classification
Source: VideoLectures.NET
Last reviewed: 2019-03-05: lxf
Views: 58