0


大图中的快速增量邻近搜索

Fast Incremental Proximity Search in Large Graphs
课程网址: http://videolectures.net/icml08_sarkar_slg/  
主讲教师: Purnamrita Sarkar
开课单位: 卡内基梅隆大学
开课时间: 2008-08-01
课程语种: 英语
中文简介:
本文研究了大图排序问题的两个方面。首先,我们利用采样技术对sarkar和moore(2007)中的确定性剪枝算法进行了扩充,以计算查询时基于随机游走的邻近度测量下的高概率近似正确的排序。其次,通过对随机游动的短期行为的研究,证明了这些邻近测度的一些令人惊讶的局部性质。所提出的算法可以在不缓存整个图的任何信息的情况下即时回答查询。我们在一台单CPU机器上从citeseer域得到的600000个节点的作者引文图上给出了经验结果,该机器的平均查询处理时间约为4秒。我们提出了可量化的链路预测任务。在大多数情况下,我们的技术优于个性化的pagerank,这是一种众所周知的基于扩散的接近度度量。
课程简介: In this paper we investigate two aspects of ranking problems on large graphs. First, we augment the deterministic pruning algorithm in Sarkar and Moore (2007) with sampling techniques to compute approximately correct rankings with high probability under random walk based proximity measures at query time. Second, we prove some surprising locality properties of these proximity measures by examining the short term behavior of random walks. The proposed algorithm can answer queries on the fly without caching any information about the entire graph. We present empirical results on a 600,000 node author-word-citation graph from the Citeseer domain on a single CPU machine where the average query processing time is around 4 seconds. We present quantifiable link prediction tasks. On most of them our techniques outperform Personalized Pagerank, a well-known diffusion based proximity measure.
关 键 词: 确定性剪枝算法; 采样技术; 邻近措施; 链路预测任务
课程来源: 视频讲座网
最后编审: 2019-12-07:lxf
阅读次数: 56