Non-Redundant Subgroup Discovery Using a Closure System |
|
Course URL: | http://videolectures.net/ecmlpkdd09_boley_nrsducs/ |
Lecturer: | Mario Boley |
Institution: | Saarland University |
Date: | 2009-10-20 |
Language: | English |
Course description: | Subgroup discovery is a local pattern discovery task in which descriptions of subpopulations of a database are evaluated against some quality function. As standard quality functions are functions of the described subpopulation, we propose to search for equivalence classes of descriptions with respect to their extension in the database rather than for individual descriptions. These equivalence classes have unique maximal representatives that form a closure system. We show that minimum-cardinality representatives of each equivalence class can be found during the enumeration of that closure system at no additional cost, whereas finding a minimum representative of a single equivalence class is NP-hard. On several real-world datasets we demonstrate that search space and output are significantly reduced by considering equivalence classes instead of individual descriptions, and that the minimum representatives constitute a family of subgroup descriptions with the same or better expressive power than those generated by traditional methods. |
Keywords: | quality function; subgroup; equivalence class |
Course source: | VideoLectures.NET |
Last reviewed: | 2020-06-08: Wu Yuqiu (吴雨秋), volunteer course editor |
Views: | 61 |
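
The notions in the course description (extension, equivalence class, closed representative, minimum representative) can be illustrated with a small sketch. This is not the enumeration algorithm presented in the talk: it is a brute-force illustration that assumes descriptions are conjunctions of binary attributes, and the database `ROWS` and all function names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy database: each row is the set of attributes that hold for it.
ROWS = [
    frozenset({"a", "b", "c"}),
    frozenset({"a", "b", "c"}),
    frozenset({"a", "d"}),
    frozenset({"b", "c", "d"}),
]
ATTRIBUTES = sorted(set().union(*ROWS))


def extension(description):
    """Indices of the rows that satisfy every attribute in the description."""
    return frozenset(i for i, row in enumerate(ROWS) if description <= row)


def closure(description):
    """Largest description with the same extension as the given one."""
    ext = extension(description)
    if not ext:
        # No row matches: the maximal description is the full attribute set.
        return frozenset(ATTRIBUTES)
    return frozenset.intersection(*(ROWS[i] for i in ext))


def equivalence_classes():
    """Group all descriptions by extension (brute force, exponential cost)."""
    classes = {}
    for size in range(len(ATTRIBUTES) + 1):
        for combo in combinations(ATTRIBUTES, size):
            desc = frozenset(combo)
            classes.setdefault(extension(desc), []).append(desc)
    return classes


def summarize():
    """Map each extension to (closed representative, a minimum representative)."""
    summary = {}
    for ext, members in equivalence_classes().items():
        closed = closure(members[0])      # unique maximal member of the class
        minimum = min(members, key=len)   # one member of smallest cardinality
        summary[ext] = (closed, minimum)
    return summary


if __name__ == "__main__":
    for ext, (closed, minimum) in summarize().items():
        print(sorted(ext), "closed:", sorted(closed), "minimum:", sorted(minimum))
```

On this toy data the 16 possible descriptions collapse into 8 equivalence classes, which is the kind of reduction in search space and output that the description refers to. The brute-force grouping is only workable for a handful of attributes; it is meant to make the definitions concrete, not to suggest a practical method.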