

Towards mitigating the class-imbalance problem for partial label learning
Course URL: http://videolectures.net/kdd2018_wang_towards_mitigating/
Lecturer: Jing Wang
Institution: Southeast University
Date: 2018-11-23
Language: English
Course description: Partial label (PL) learning aims to induce a multi-class classifier from training examples, each of which is associated with a set of candidate labels among which only one is valid. It is well known that class imbalance is a major factor affecting the generalization performance of multi-class classifiers, and the problem becomes more pronounced in PL learning because the ground-truth label of each PL training example is not directly accessible to the learning approach. To mitigate the negative influence of class imbalance on partial label learning, a novel class-imbalance-aware approach named Cimap is proposed by adapting over-sampling techniques to handle PL training examples. First, for each PL training example, Cimap disambiguates its candidate label set by estimating the confidence of each class label being the ground-truth one via weighted k-nearest neighbor aggregation. After that, the original PL training set is replenished for model induction by over-sampling existing PL training examples via manipulation of the disambiguation results. Extensive experiments on artificial as well as real-world PL data sets show that Cimap serves as an effective data-level approach to mitigating the class-imbalance problem for partial label learning.
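The description outlines a two-stage, data-level procedure: weighted k-nearest-neighbor disambiguation of each candidate label set, followed by over-sampling guided by the disambiguation results. Below is a minimal Python sketch of that general idea, not the authors' exact Cimap algorithm; the inverse-distance weighting, the duplication-based over-sampling, and all function names are assumptions made for illustration.

```python
import numpy as np

def knn_disambiguate(X, candidate_masks, k=10):
    """Estimate per-class confidences for each PL example by weighted k-NN
    aggregation over candidate label sets (simplified stand-in for the
    disambiguation step; the inverse-distance weighting is an assumption)."""
    n, q = candidate_masks.shape                      # n examples, q class labels
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)                   # never count an example as its own neighbor
    conf = np.zeros((n, q))
    for i in range(n):
        nbrs = np.argsort(dists[i])[:k]
        w = 1.0 / (dists[i, nbrs] + 1e-8)             # inverse-distance neighbor weights
        votes = (w[:, None] * candidate_masks[nbrs]).sum(axis=0)
        votes = votes * candidate_masks[i]            # restrict to the example's own candidates
        total = votes.sum()
        conf[i] = votes / total if total > 0 else candidate_masks[i] / candidate_masks[i].sum()
    return conf

def oversample_by_confidence(X, candidate_masks, conf, seed=0):
    """Replenish the PL training set by duplicating examples whose most
    confident (provisional) label belongs to an under-represented class,
    a simple data-level over-sampling stand-in."""
    rng = np.random.default_rng(seed)
    pseudo = conf.argmax(axis=1)                      # provisional label per example
    counts = np.bincount(pseudo, minlength=conf.shape[1])
    target = counts.max()                             # balance every class up to the majority size
    X_parts, M_parts = [X], [candidate_masks]
    for c in np.flatnonzero(counts):
        deficit = target - counts[c]
        if deficit > 0:
            extra = rng.choice(np.flatnonzero(pseudo == c), size=deficit, replace=True)
            X_parts.append(X[extra])
            M_parts.append(candidate_masks[extra])
    return np.vstack(X_parts), np.vstack(M_parts)

# Usage sketch: X is an (n, d) feature matrix, candidate_masks an (n, q) 0/1 matrix
# marking each example's candidate labels.
# conf = knn_disambiguate(X, candidate_masks, k=10)
# X_aug, masks_aug = oversample_by_confidence(X, candidate_masks, conf)
```

The over-sampling step here simply duplicates examples whose provisional label falls in an under-represented class until class counts are balanced; Cimap's actual manipulation of the disambiguation results may differ.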
Keywords: multi-class classifier induction; PL training examples; ground-truth labels; over-sampling of PL training examples
Source: VideoLectures.NET
Data collected: 2023-01-30:cyh
Last reviewed: 2023-01-30:cyh
Views: 31