0


将近似最佳特征选择与gSpan相结合

Combining near-optimal feature selection with gSpan
课程网址: http://videolectures.net/mlg08_thoma_cnofs/  
主讲教师: Marisa Thoma
开课单位: 慕尼黑大学
开课时间: 2008-08-25
课程语种: 英语
中文简介:
图分类是许多应用领域中越来越重要的步骤,例如分子和蛋白质的功能预测,计算机化场景分析和程序流程中的异常检测。在文献中提出的各种方法中,基于频繁子图的图分类是一种流行的分支:图表被表示为(通常是二进制)向量,其中组件指示图是否包含在数据集中频繁的特定子图。然而,在大图上,人们面临着这样一个巨大的问题:这些频繁子图的数量可能随着图的大小呈指数增长,但只有少数具有足够的判别力才能使它们对图分类有用。因此,频繁子图之间的有效和有区别的特征选择是图挖掘的关键挑战。在本文中,我们提出了一种在频繁子图上进行特征选择的方法,称为CORK,它结合了两个主要优点。首先,它优化了子模块质量标准,这意味着我们可以使用贪婪特征选择产生接近最优的解决方案。其次,我们的子模块质量功能标准可以集成到gSpan中,这是用于频繁子图挖掘的最先进工具,并且有助于在频繁的子图挖掘中修剪有区别的频繁子图的搜索空间。
课程简介: Graph classification is an increasingly important step in numerous application domains, such as function prediction of molecules and proteins, computerized scene analysis, and anomaly detection in program flows. Among the various approaches proposed in the literature, graph classification based on frequent subgraphs is a popular branch: Graphs are represented as (usually binary) vectors, with components indicating whether a graph contains a particular subgraph that is frequent across the dataset. On large graphs, however, one faces the enormous problem that the number of these frequent subgraphs may grow exponentially with the size of the graphs, but only few of them possess enough discriminative power to make them useful for graph classification. Efficient and discriminative feature selection among frequent subgraphs is hence a key challenge for graph mining. In this article, we propose an approach to feature selection on frequent subgraphs, called CORK, that combines two central advantages. First, it optimizes a sub modular quality criterion, which means that we can yield a near-optimal solution using greedy feature selection. Second, our sub modular quality function criterion can be integrated into gSpan, the state-of-the-art tool for frequent subgraph mining, and help to prune the search space for discriminative frequent subgraphs even during frequent subgraph mining.
关 键 词: 图分类; 频繁子图; 搜索空间
课程来源: 视频讲座网
最后编审: 2019-07-02:cjy
阅读次数: 66