Exploring the space of coding matrix classifiers for hierarchical multiclass text categorization
Course URL: http://videolectures.net/sikdd2011_brank_hierarchical/
Lecturer: Janez Brank
Institution: Jožef Stefan Institute
Date: 2011-11-04
Language: English
Course description: One way to approach a multiclass classification problem is to transform it into several two-class (binary) classification problems. An ensemble of binary classifiers is trained for these tasks, and their predictions are combined by a voting method into predictions for the original multiclass problem. Each new binary problem uses some of the original classes as positive training data and some as negative training data; the remaining classes (if any) are not used at all. The relationship between the classes of the original problem and the binary classifiers can be represented concisely by a matrix called the coding matrix. In this paper we explore some of the statistical properties of the space of coding matrix based classifiers in the context of a small hierarchical multiclass learning problem.
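The coding matrix idea in the description can be made concrete with a minimal Python sketch. Everything specific here is an assumption for illustration and is not taken from the lecture: the three classes A, B, C, the particular matrix (a one-vs-one layout), the helper names `binary_training_set` and `decode`, and the decoding rule, which simply sums the votes that agree with each class's row.

```python
# Illustrative coding matrix for three hypothetical classes A, B, C.
# Rows are original classes, columns are binary classifiers.
#   +1: the class supplies positive training data for that classifier
#   -1: the class supplies negative training data
#    0: the class is not used by that classifier
CODING_MATRIX = {
    "A": [+1, +1, 0],
    "B": [-1, 0, +1],
    "C": [0, -1, -1],
}


def binary_training_set(column, labeled_examples):
    """Relabel multiclass examples for one binary problem, dropping unused classes."""
    relabeled = []
    for example, cls in labeled_examples:
        code = CODING_MATRIX[cls][column]
        if code != 0:
            relabeled.append((example, code))
    return relabeled


def decode(binary_predictions):
    """Combine +1/-1 binary predictions into one class by simple voting (assumed scheme)."""
    scores = {
        cls: sum(code * pred for code, pred in zip(row, binary_predictions))
        for cls, row in CODING_MATRIX.items()
    }
    return max(scores, key=scores.get)


if __name__ == "__main__":
    docs = [("doc1", "A"), ("doc2", "B"), ("doc3", "C")]
    print(binary_training_set(0, docs))
    print(decode([+1, +1, -1]))
```

Other matrices over the same classes (for example one-vs-rest, or columns that group classes along a hierarchy) define different ensembles; the set of all such matrices is the space the description refers to.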
Keywords: machine learning; ensemble methods; computer science; text mining; classification space
Source: VideoLectures.NET
Last reviewed: 2020-07-28 (yumf)
Views: 36