Leveraging Temporal Dynamics of Document Content in Relevance Ranking
Course URL: http://videolectures.net/wsdm2010_elsas_ltd/
Lecturer: Jonathan Elsas
Institution: Carnegie Mellon University
Date: 2010-03-22
Language: English
Abstract: Many web documents are dynamic, with content changing in varying amounts at varying frequencies. However, current document search algorithms have a static view of the document content, with only a single version of the document in the index at any point in time. In this paper, we present the first published analysis of using the temporal dynamics of document content to improve relevance ranking. We show that there is a strong relationship between the amount and frequency of content change and relevance. We develop a novel probabilistic document ranking algorithm that allows differential weighting of terms based on their temporal characteristics. By leveraging such content dynamics, we show significant performance improvements for navigational queries.
Keywords: document search algorithms; static view; temporal dynamics; navigational queries
Source: VideoLectures.NET
Last reviewed: 2019-11-11 (lxf)