

On Burstiness-Aware Search for Document Sequences
Course URL: http://videolectures.net/kdd09_lappas_obasds/
Lecturer: Theodoros Lappas
Institution: University of California
Date: 2009-09-14
Language: English
Abstract: As the number and size of large timestamped collections (e.g., sequences of digitized newspapers, periodicals, blogs) increase, the problem of efficiently indexing and searching such data becomes more important. Term burstiness has been extensively researched as a mechanism to address event detection in the context of such collections. In this paper, we explore how burstiness information can be further utilized to enhance the search process. We present a novel approach to model the burstiness of a term, using discrepancy theory concepts. This allows us to build a parameter-free, linear-time approach to identify the time intervals of maximum burstiness for a given term. Finally, we describe the first burstiness-driven search framework and thoroughly evaluate our approach in the context of different scenarios.
Keywords: computer science; document search; data
Source: VideoLectures.NET
Last reviewed: 2021-01-30 (nkq)
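
The abstract mentions a parameter-free, linear-time way to find the interval of maximum burstiness for a term, but this listing does not give the details. The following is only a minimal sketch of one way such a search could look, under stated assumptions: each timestamp is scored by the discrepancy between the term's observed share of occurrences and a uniform baseline of 1/n, and the highest-scoring contiguous interval is found with a single-pass maximum-sum segment scan (Kadane's algorithm). The function name `max_burst_interval`, the uniform baseline, and the choice of Kadane's algorithm are illustrative assumptions, not the method presented in the lecture.

```python
from typing import List, Tuple


def max_burst_interval(counts: List[int]) -> Tuple[int, int, float]:
    """Return (start, end, score) of the contiguous interval whose share of
    the term's occurrences most exceeds a uniform baseline.

    counts[i] is the term frequency at timestamp i (hypothetical input format).
    Indices are inclusive. Runs in O(n) time with no tunable parameters.
    """
    n = len(counts)
    total = sum(counts)
    if n == 0 or total == 0:
        raise ValueError("need a non-empty sequence with at least one occurrence")

    baseline = 1.0 / n  # assumed: occurrences expected to be spread uniformly

    best_start = best_end = 0
    best_score = float("-inf")
    run_start = 0
    run_score = 0.0

    for i, c in enumerate(counts):
        # Discrepancy at timestamp i: observed share minus expected share.
        score = c / total - baseline
        if run_score <= 0.0:
            # A non-positive running sum cannot help; start a new interval here.
            run_start = i
            run_score = score
        else:
            run_score += score
        if run_score > best_score:
            best_score = run_score
            best_start, best_end = run_start, i

    return best_start, best_end, best_score


if __name__ == "__main__":
    # A term that is rare except for a spike at timestamps 3 and 4.
    print(max_burst_interval([1, 0, 1, 9, 8, 1, 0]))
```

With these toy counts the scan picks out timestamps 3 through 4, the two-step spike. A search framework could then, for example, favor documents whose timestamps fall inside such an interval, though how the lecture's framework uses the intervals is not stated here.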