Local Decomposition for Rare Class Analysis
Course URL: http://videolectures.net/kdd07_wu_ldrc/
Lecturer: Junjie Wu
Institution: Oregon State University
Date: 2007-08-14
Language: English
Abstract: Given its importance, the problem of predicting rare classes in large-scale multi-labeled data sets has attracted great attention in the literature. However, the rare-class problem remains a critical challenge, because no natural method has been developed for handling imbalanced class distributions. This paper fills this void by developing a method for Classification using lOcal clusterinG (COG). Specifically, for a data set with an imbalanced class distribution, we perform clustering within each large class and produce sub-classes of relatively balanced sizes. We then apply traditional supervised learning algorithms, such as Support Vector Machines (SVMs), for classification. Our experimental results on various real-world data sets show that our method produces significantly higher prediction accuracies on rare classes than state-of-the-art methods. Furthermore, we show that COG can also improve the performance of traditional supervised learning algorithms on data sets with balanced class distributions.
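The two-step idea in the abstract (cluster inside each large class, then train an ordinary classifier on the resulting sub-classes) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names (`kmeans`, `cog_relabel`, `fit_centroids`, `predict`), the toy data, and the choice of `k = 2` are all assumptions, and a simple nearest-centroid classifier stands in for the SVM so that the example needs only the Python standard library.

```python
import random
from collections import defaultdict


def sqdist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Coordinate-wise mean of a non-empty list of tuples."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns one cluster id per input point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        assign = [min(range(k), key=lambda c: sqdist(p, centers[c]))
                  for p in points]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = mean(members)
    return assign


def cog_relabel(X, y, large_classes, k):
    """Split every class in large_classes into k sub-classes by clustering.

    Sub-class labels are (original_label, cluster_id) pairs, so the
    original class can always be recovered from a sub-class label.
    """
    sub = [(label, 0) for label in y]
    for label in large_classes:
        idx = [i for i, lab in enumerate(y) if lab == label]
        clusters = kmeans([X[i] for i in idx], k)
        for i, c in zip(idx, clusters):
            sub[i] = (label, c)
    return sub


def fit_centroids(X, sub_labels):
    """Stand-in for the SVM: one centroid per sub-class."""
    groups = defaultdict(list)
    for x, s in zip(X, sub_labels):
        groups[s].append(x)
    return {s: mean(pts) for s, pts in groups.items()}


def predict(centroids, x):
    """Pick the nearest sub-class, then map it back to its original class."""
    sub = min(centroids, key=lambda s: sqdist(x, centroids[s]))
    return sub[0]


# Toy data (hypothetical): a large class made of two distant blobs,
# and a rare class sitting between them.
rng = random.Random(42)
X, y = [], []
for cx in (-5.0, 5.0):
    for _ in range(50):
        X.append((cx + rng.gauss(0, 0.5), rng.gauss(0, 0.5)))
        y.append("common")
for _ in range(5):
    X.append((rng.gauss(0, 0.5), rng.gauss(0, 0.5)))
    y.append("rare")

sub_labels = cog_relabel(X, y, large_classes=["common"], k=2)
centroids = fit_centroids(X, sub_labels)

print(predict(centroids, (0.0, 0.1)))
print(predict(centroids, (5.0, 0.0)))
```

In this toy setup the large class has no single compact region, so a classifier trained on the original two labels would place the large class's centroid close to the rare class. Splitting the large class first gives each blob its own sub-class, and the sub-class sizes (roughly 50, 50, and 5) are less skewed than the original 100 versus 5. Swapping the nearest-centroid step for an SVM, as the abstract describes, would only change `fit_centroids` and `predict`.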
Keywords: multi-labeled data sets; imbalanced class distribution; balanced class distribution
Source: VideoLectures.NET
Last reviewed: 2019-05-09:lxf