Quantification and Semi-supervised Classification Methods for Handling Changes in Class Distribution

Course URL: http://videolectures.net/kdd09_weiss_qsscmhccd/
Lecturer: Gary M. Weiss
Institution: Fordham University
Date: 2009-09-14
Language: English
Course description: In realistic settings, the prevalence of a class may change after a classifier is induced, and this will degrade the performance of the classifier. Further complicating this scenario is the fact that labeled data is often scarce and expensive. In this paper we address the problem where the class distribution changes and only unlabeled examples are available from the new distribution. We design and evaluate a number of methods for coping with this problem and compare their performance. Our quantification-based methods estimate the class distribution of the unlabeled data from the changed distribution and adjust the original classifier accordingly, while our semi-supervised methods build a new classifier using the examples from the new (unlabeled) distribution, supplemented with predicted class values. We also introduce a hybrid method that utilizes both quantification and semi-supervised learning. All methods are evaluated using accuracy and F-measure on a set of benchmark data sets. Our results demonstrate that our methods yield substantial improvements in accuracy and F-measure.
Keywords: classifier; labeled data; class distribution
Source: VideoLectures.NET
Last reviewed: 2019-05-10 (lxf)
View count: 33
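
The course description mentions quantification-based methods that estimate the class distribution of the new, unlabeled data and then adjust the original classifier. The entry does not say which estimator or adjustment the talk uses, so the following is only an illustrative sketch under stated assumptions: it uses the well-known "adjusted classify and count" estimate for a binary problem and a standard prior-ratio correction of the posterior. The function names `adjusted_count` and `reweight_posterior`, and the inputs `tpr` and `fpr` (true and false positive rates measured on held-out labeled data), are hypothetical choices made for this example.

```python
def adjusted_count(predictions, tpr, fpr):
    """Estimate positive-class prevalence from hard 0/1 predictions.

    Assumption: tpr and fpr were measured on labeled data from the
    original distribution and still hold under the new distribution.
    """
    observed = sum(predictions) / len(predictions)
    if tpr == fpr:
        # The classifier carries no signal; fall back to the raw count.
        return observed
    estimate = (observed - fpr) / (tpr - fpr)
    # Clip to a valid proportion.
    return min(1.0, max(0.0, estimate))


def reweight_posterior(p_pos, old_prior, new_prior):
    """Rescale P(positive | x) from the training prior to a new prior.

    Assumption: only the class priors changed, and old_prior lies
    strictly between 0 and 1.
    """
    pos = p_pos * new_prior / old_prior
    neg = (1.0 - p_pos) * (1.0 - new_prior) / (1.0 - old_prior)
    return pos / (pos + neg)


# Hypothetical usage: 40 of 100 unlabeled examples were predicted positive.
unlabeled_predictions = [1] * 40 + [0] * 60
new_prior = adjusted_count(unlabeled_predictions, tpr=0.8, fpr=0.2)
adjusted = reweight_posterior(0.5, old_prior=0.5, new_prior=new_prior)
```

In this sketch the raw positive rate of 0.4 is corrected for the classifier's known error rates before being used as the new prior, and an example that scored 0.5 under the original balanced prior is pushed below 0.5 once the estimated prevalence drops.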