0


大型属性图的快速尽力而为模式匹配

Fast Best-Effort Pattern Matching in Large Attributed Graphs
课程网址: http://videolectures.net/kdd07_tong_fbep/  
主讲教师: Hanghang Tong
开课单位: 亚利桑那州立大学
开课时间: 2007-08-14
课程语种: 英语
中文简介:

我们专注于节点具有属性的大型图形,例如社交网络,其中节点用每个人的职称进行标记。在这样的设置中,我们希望找到与用户查询模式匹配的子图。例如,“星级”查询将是“找到与经理,律师和会计师或与之尽可能接近的其他组织有很强互动性的首席执行官”。同样,“循环”查询可以帮助发现洗钱圈。当不存在精确匹配时,传统的基于SQL的方法以及最新的图形索引方法将不返回任何答案。我们的方法可以找到完全匹配以及接近匹配,然后将其按照我们建议的“良好”顺序呈现给用户。例如,当直接路径不存在时,我们的方法可以容忍上述示例查询的“ CEO”和“会计”之间的间接路径。它的第二个功能是可伸缩性。通常,如果查询具有nq个节点,而数据图具有n个节点,则该问题需要多项式时间复杂度O(nnq),这是令人望而却步的。我们的G射线(“图形X射线”)方法可以在时间上找到高质量的子图,这些子图与数据图的大小呈线性关系。 DLBP作者发布图(具有356K节点和1.9M边缘)上的实验结果说明了我们方法的有效性和可扩展性。结果符合我们的直觉,并且速度非常好。在DBLP图表上进行4节点查询平均需要4秒钟。

课程简介: We focus on large graphs where nodes have attributes, such as a social network where the nodes are labelled with each person’s job title. In such a setting, we want to find subgraphs that match a user query pattern. For example, a ‘star’ query would be, “find a CEO who has strong interactions with a Manager, a Lawyer, and an Accountant, or another structure as close to that as possible”. Similarly, a ‘loop’ query could help spot a money laundering ring. Traditional SQL-based methods, as well as more recent graph indexing methods, will return no answer when an exact match does not exist. Our method can find exact-, as well as near-matches, and it will present them to the user in our proposed ‘goodness’ order. For example, our method tolerates indirect paths between, say, the ‘CEO’ and the ‘Accountant’ of the above sample query, when direct paths do not exist. Its second feature is scalability. In general, if the query has nq nodes and the data graph has n nodes, the problem needs polynomial time complexity O(nnq ), which is prohibitive. Our G-Ray (“Graph X-Ray”) method finds high-quality subgraphs in time linear on the size of the data graph. Experimental results on the DLBP author-publication graph (with 356K nodes and 1.9M edges) illustrate both the effectiveness and scalability of our approach. The results agree with our intuition, and the speed is excellent. It takes 4 seconds on average for a 4- node query on the DBLP graph.
关 键 词: 节点; 数据图; 属性图
课程来源: 视频讲座网
数据采集: 2021-02-16:nkq
最后编审: 2021-02-16:nkq
阅读次数: 68