Fast Query Execution for Retrieval Models Based on Path-Constrained Random Walks
Course URL: http://videolectures.net/kdd2010_lao_fer/
Lecturer: Ni Lao
Institution: Carnegie Mellon University
Date: 2010-10-01
Language: English
Abstract: Many recommendation and retrieval tasks can be represented as proximity queries on a labeled directed graph, with typed nodes representing documents, terms, and metadata, and labeled edges representing the relationships between them. Recent work has shown that the accuracy of the widely used random-walk-based proximity measures can be improved by supervised learning. One especially effective learning technique is based on Path-Constrained Random Walks (PCRW), in which similarity is defined by a learned combination of constrained random walkers, each constrained to follow only a particular sequence of edge labels away from the query nodes. The PCRW-based method significantly outperformed unsupervised random-walk-based queries and models with learned edge weights. Unfortunately, PCRW query systems are expensive to evaluate. In this study we evaluate the use of approximations to the computation of the PCRW distributions, including fingerprinting, particle filtering, and truncation strategies. In experiments on several recommendation and retrieval problems using two large corpora of scientific publications, we show speedups by factors of 2 to 100 with little loss in accuracy.
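To make the idea in the abstract concrete, here is a minimal Python sketch of a path-constrained random walk on a toy labeled graph. It is an illustration under stated assumptions, not the lecture's implementation: the graph encoding, the function names (`pcrw`, `score`), the `keep` parameter, the toy edges, and the path weights are all hypothetical. In the described method the weights would be learned by supervised training, and the truncation shown here (keeping only the largest probabilities after each step) is just one simple way such a strategy could look.

```python
from collections import defaultdict


def pcrw(graph, start, path, keep=None):
    """Distribution of a walker that starts at `start` and follows the
    edge labels in `path`, choosing uniformly among matching edges.

    graph: dict mapping (node, label) -> list of neighbor nodes (assumed encoding).
    keep:  if given, retain only the `keep` most probable nodes after each
           step (a simple truncation approximation; hypothetical parameter).
    """
    dist = {start: 1.0}
    for label in path:
        nxt = defaultdict(float)
        for node, p in dist.items():
            nbrs = graph.get((node, label), [])
            # Mass at nodes with no matching edge is dropped.
            for n in nbrs:
                nxt[n] += p / len(nbrs)
        dist = dict(nxt)
        if keep is not None:
            top = sorted(dist.items(), key=lambda kv: kv[1], reverse=True)[:keep]
            dist = dict(top)
    return dist


def score(graph, query, paths, weights, keep=None):
    """Weighted combination of per-path walk distributions."""
    scores = defaultdict(float)
    for path, w in zip(paths, weights):
        for node, p in pcrw(graph, query, path, keep).items():
            scores[node] += w * p
    return dict(scores)


# Toy graph (hypothetical): papers p1..p4 and an author a1.
graph = {
    ("p1", "cites"): ["p2", "p3"],
    ("p2", "cites"): ["p3"],
    ("p1", "author"): ["a1"],
    ("a1", "wrote"): ["p1", "p4"],
}

# Two edge-label paths with made-up weights.
paths = [["cites"], ["author", "wrote"]]
weights = [1.0, 0.5]
ranking = score(graph, "p1", paths, weights)
```

With `keep` set to a small number, each step touches fewer nodes, at the cost of discarding low-probability mass. This is the kind of speed-versus-accuracy trade-off the abstract evaluates.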
Keywords: supervised learning; constrained random walks; learned edge weights
Source: VideoLectures.NET
Last reviewed: 2019-05-11:lxf
Views: 76