Evaluation Measures for Multi-Class Subgroup Discovery
Course URL: http://videolectures.net/ecmlpkdd09_abudawood_emmcsd/
Lecturer: Tarek Abudawood
Institution: University of Bristol
Date: 2009-10-20
Language: English
Course description: Subgroup discovery aims to find subsets of a population whose class distribution differs significantly from the overall distribution. It has previously been investigated predominantly in a two-class context. This paper investigates multi-class subgroup discovery methods. We consider six evaluation measures for multi-class subgroups, four of them new, and study their theoretical properties. We extend the two-class subgroup discovery algorithm CN2-SD to incorporate the new evaluation measures and a new weighting scheme inspired by AdaBoost. We demonstrate the usefulness of multi-class subgroup discovery experimentally, using discovered subgroups as features for a decision tree learner. Not only is the number of leaves of the decision tree reduced by a factor of between 8 and 16 on average, but significant improvements in accuracy and AUC are also achieved with particular evaluation measures and settings. Similar performance improvements can be observed when using naive Bayes.
Keywords: multi-class subgroups; subgroup features for decision tree learners; evaluation measures
Source: VideoLectures.NET
Last reviewed: 2020-11-13 (yumf)