UP-Growth: An Efficient Algorithm for High Utility Itemset Mining
Course URL: http://videolectures.net/kdd2010_wu_uge/
Lecturer: Cheng-Wei Wu
Institution: National Cheng Kung University
Date: 2010-10-01
Language: English
Course description: Mining high utility itemsets from a transactional database refers to the discovery of itemsets with high utility, such as profit. Although a number of relevant approaches have been proposed in recent years, they suffer from producing a large number of candidate itemsets for high utility itemsets. Such a large number of candidate itemsets degrades mining performance in terms of execution time and space requirements. The situation may become worse when the database contains many long transactions or long high utility itemsets. In this paper, we propose an efficient algorithm, namely UP-Growth (Utility Pattern Growth), for mining high utility itemsets with a set of techniques for pruning candidate itemsets. The information about high utility itemsets is maintained in a special data structure named UP-Tree (Utility Pattern Tree), so that candidate itemsets can be generated efficiently with only two scans of the database. The performance of UP-Growth was evaluated in comparison with state-of-the-art algorithms on different types of datasets. The experimental results show that UP-Growth not only reduces the number of candidates effectively but also outperforms other algorithms substantially in terms of execution time, especially when the database contains many long transactions.
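To make the problem statement concrete, the following is a minimal Python sketch of what "high utility itemset" means. It is not UP-Growth and does not build a UP-Tree; it is a brute-force enumeration over a hypothetical toy database. It assumes the usual definition in which an item's utility in a transaction is its purchased quantity times its unit profit. The data, the function names, and the threshold are all illustrative assumptions, not taken from the lecture.

```python
from itertools import combinations

# Hypothetical toy data: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "c": 1, "d": 4},
]
# Hypothetical unit profit per item.
unit_profit = {"a": 5, "b": 2, "c": 1, "d": 3}


def itemset_utility(itemset, transactions, unit_profit):
    """Sum quantity * unit profit over every transaction containing the whole itemset."""
    total = 0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] * unit_profit[item] for item in itemset)
    return total


def brute_force_high_utility(transactions, unit_profit, min_utility):
    """Enumerate every non-empty itemset and keep those meeting the threshold.

    This checks 2^n - 1 candidates for n distinct items, which is exactly the
    candidate explosion that pruning techniques aim to avoid.
    """
    items = sorted({item for t in transactions for item in t})
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            utility = itemset_utility(itemset, transactions, unit_profit)
            if utility >= min_utility:
                result[itemset] = utility
    return result


high = brute_force_high_utility(transactions, unit_profit, min_utility=15)
for itemset, utility in high.items():
    print(itemset, utility)
```

In this toy example the itemset ("b", "c", "d") meets the threshold even though ("b",) alone does not, so a low-utility itemset cannot simply be discarded together with its supersets. That is one reason candidate pruning for utility mining needs more care than support-based pruning in frequent itemset mining.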
Keywords: frequent itemset mining; transactional database; high utility itemset
Source: VideoLectures.NET
Last reviewed: 2020-06-12: yumf
Views: 46