

Large-Scale Graph Mining Using Backbone Refinement Classes
Course URL: http://videolectures.net/kdd09_maunz_lsgmubrc/
Lecturer: Andreas Maunz
Institution: University of Freiburg
Course date: Not available.
Language: English
Course description: We present a new approach to large-scale graph mining based on so-called backbone refinement classes. The method efficiently mines tree-shaped subgraph descriptors under minimum frequency and significance constraints, using classes of fragments to reduce feature set size and running time. The classes are defined in terms of fragments sharing a common backbone. The method optimizes structural inter-feature entropy rather than occurrences, the latter being characteristic of open or closed fragment mining. In the experiments, the proposed method reduces feature set sizes by more than 90% compared to complete tree mining and by more than 30% compared to open tree mining. Evaluation using cross-validation runs shows that the classification accuracy of the resulting descriptors is similar to that of the complete set of trees but significantly better than that of open trees. Compared to open or closed fragment mining, a large part of the search space can be pruned thanks to an improved statistical constraint (dynamic upper bound adjustment); the experiments confirm this with lower running times than ordinary (static) upper bound pruning. Further analysis using large-scale datasets yields insight into important properties of the proposed descriptors, such as dataset coverage and the class size represented by each descriptor. A final cross-validation run confirms that the novel descriptors make feasible large training sets that might previously have been intractable.
Keywords: computer science; data mining; graph mining
Source: VideoLectures.NET
Last reviewed: 2019-11-18:cwx
Views: 33