Unsupervised Transfer Classification: Application to Text Categorization
Course URL: http://videolectures.net/kdd2010_yang_utlatc/
Lecturer: Tianbao Yang
Institution: University of Iowa
Date: 2010-10-01
Language: English
Abstract: We study the problem of building a classification model for a target class in the absence of any labeled training examples for that class. To address this difficult learning problem, we extend the idea of transfer learning by assuming that the following side information is available: (i) a collection of labeled examples belonging to other classes in the problem domain, called the auxiliary classes; (ii) class information, including the prior of the target class and the correlation between the target class and the auxiliary classes. Our goal is to construct the classification model for the target class by leveraging the above data and information. We refer to this learning problem as unsupervised transfer classification. Our framework is based on the generalized maximum entropy model, which is effective in transferring the label information of the auxiliary classes to the target class. A theoretical analysis shows that, under certain assumptions, the classification model obtained by the proposed approach converges to the optimal model that would be learned from labeled examples of the target class. An empirical study on text categorization over four different data sets verifies the effectiveness of the proposed approach.
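Illustrative sketch: the setting in the abstract above (labeled auxiliary classes, plus a prior and class correlations for an unlabeled target class) can be made concrete with a small Python example. This is a simplified heuristic written for illustration, not the generalized maximum entropy model presented in the lecture. It assumes the class information has already been turned into a hypothetical vector `cond_prob` of P(target | auxiliary class k); the function names, toy data, and hyperparameters are all assumptions. It builds soft pseudo-labels for the target class and fits a binary logistic (maximum entropy) model to them.

```python
import numpy as np

def soft_target_labels(Y_aux, cond_prob, prior):
    """Estimate P(target=1) for each example from its auxiliary labels (heuristic)."""
    Y_aux = np.asarray(Y_aux, dtype=float)
    counts = Y_aux.sum(axis=1)
    # Average the assumed P(target | auxiliary class) over the classes present.
    avg = (Y_aux @ cond_prob) / np.maximum(counts, 1.0)
    # Fall back to the target-class prior when an example has no auxiliary label.
    return np.where(counts > 0, avg, prior)

def add_bias(X):
    """Append a constant column so the model can learn an intercept."""
    X = np.asarray(X, dtype=float)
    return np.hstack([X, np.ones((X.shape[0], 1))])

def fit_logistic(X, q, lr=0.5, steps=2000, l2=1e-3):
    """Fit a binary logistic model to soft labels q by plain gradient descent."""
    Xb = add_bias(X)
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(Xb @ w)))
        # Gradient of the mean cross-entropy plus an L2 penalty.
        grad = Xb.T @ (p - q) / len(q) + l2 * w
        w -= lr * grad
    return w

def predict_proba(X, w):
    """Predicted probability of the target class for each row of X."""
    return 1.0 / (1.0 + np.exp(-(add_bias(X) @ w)))

# Toy data (made up): 2 word-count features, 2 auxiliary classes.
X = np.array([[3.0, 0.0], [2.0, 1.0], [0.0, 3.0], [1.0, 2.0]])
Y_aux = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])
cond_prob = np.array([0.9, 0.1])  # assumed P(target | auxiliary class k)
prior = 0.5                       # assumed P(target)

q = soft_target_labels(Y_aux, cond_prob, prior)
w = fit_logistic(X, q)
scores = predict_proba(X, w)
```

In this toy setup, documents labeled with the first auxiliary class receive high pseudo-labels for the target class, so the fitted model scores documents dominated by the first feature above those dominated by the second, without ever seeing a labeled target example.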
Keywords: computer science; pattern recognition; classification models
Source: VideoLectures.NET
Last reviewed: 2020-06-13, 邬启凡 (volunteer course editor)
Views: 39