一种保结构图割的局部算法A Local Algorithm for StructurePreserving Graph Cut |
|
课程网址: | http://videolectures.net/kdd2017_zhou_local_algorithm/ |
主讲教师: | 周大伟 |
开课单位: | 亚利桑那州立大学 |
开课时间: | 2017-10-09 |
课程语种: | 英语 |
中文简介: | 如今,大规模的图数据正在各种现实世界的应用中生成,从社交网络到合着网络,从蛋白质-蛋白质相互作用网络到道路交通网络。现有的许多图挖掘工作都集中在顶点和边上,以一阶马尔可夫链作为底层模型。他们未能探索高阶网络结构,而这在许多高影响领域中至关重要。例如,在银行客户个人身份信息(PII)网络中,星形结构通常对应于一组合成身份;在金融交易网络中,循环结构可能表明洗钱活动的存在。在本文中, 寻找结构丰富的子图的一个关键挑战是高昂的计算成本。为了解决这个问题,受到一系列本地图聚类算法的启发,这些算法可以在不探索整个图的情况下有效地识别低电导切割,我们建议推广建模高阶网络结构的关键思想。特别是,我们从高阶电导的通用定义开始,并定义高阶扩散核心,它基于用户指定的高阶网络结构引起的高阶随机游走。然后,我们提出了一种新颖的高阶结构保留局部切割(HOSPLOC)算法,该算法相对于图中边的数量以多对数时间运行。它从种子顶点开始,迭代地探索其邻域,直到找到具有较小高阶电导的子图。此外,我们还从有效性和效率方面分析其性能。合成图和真实图的实验结果证明了我们提出的 HOSPLOC 算法的有效性和效率。 |
课程简介: | Nowadays, large-scale graph data is being generated in a variety of real-world applications, from social networks to co-authorship networks, from protein-protein interaction networks to road traffic networks. Many existing works on graph mining focus on the vertices and edges, with the first-order Markov chain as the underlying model. They fail to explore the high-order network structures, which are of key importance in many high impact domains. For example, in bank customer personally identifiable information (PII) networks, the star structures often correspond to a set of synthetic identities; in financial transaction networks, the loop structures may indicate the existence of money laundering. In this paper, we focus on mining user-specified high-order network structures and aim to find a structure-rich subgraph which does not break many such structures by separating the subgraph from the rest. A key challenge associated with finding a structure-rich subgraph is the prohibitive computational cost. To address this problem, inspired by the family of local graph clustering algorithms for efficiently identifying a low-conductance cut without exploring the entire graph, we propose to generalize the key idea to model high-order network structures. In particular, we start with a generic definition of high-order conductance, and define the high-order diffusion core, which is based on a high-order random walk induced by user-specified high-order network structure. Then we propose a novel High-Order Structure-Preserving LOcal Cut (HOSPLOC) algorithm, which runs in polylogarithmic time with respect to the number of edges in the graph. It starts with a seed vertex and iteratively explores its neighborhood until a subgraph with a small high-order conductance is found. Furthermore, we analyze its performance in terms of both effectiveness and efficiency. The experimental results on both synthetic graphs and real graphs demonstrate the effectiveness and efficiency of our proposed HOSPLOC algorithm. |
关 键 词: | 局部算法; 图数据; 底层模型 |
课程来源: | 视频讲座网 |
数据采集: | 2023-12-26:wujk |
最后编审: | 2023-12-26:wujk |
阅读次数: | 24 |