Connecting the Dots Between News Articles
Course URL: http://videolectures.net/kdd2010_shahaf_cdb/
Lecturer: Dafna Shahaf
Institution: Carnegie Mellon University
Course date: 2010-10-01
Language: English
Description: The process of extracting useful knowledge from large datasets has become one of the most pressing problems in today's society. The problem spans entire sectors, from scientists to intelligence analysts and web users, all of whom are constantly struggling to keep up with the larger and larger amounts of content published every day. With this much data, it is often easy to miss the big picture. In this paper, we investigate methods for automatically connecting the dots -- providing a structured, easy way to navigate within a new topic and discover hidden connections. We focus on the news domain: given two news articles, our system automatically finds a coherent chain linking them together. For example, it can recover the chain of events starting with the decline of home prices (January 2007), and ending with the ongoing health-care debate. We formalize the characteristics of a good chain and provide an efficient algorithm (with theoretical guarantees) to connect two fixed endpoints. We incorporate user feedback into our framework, allowing the stories to be refined and personalized. Finally, we evaluate our algorithm over real news data. Our user studies demonstrate the algorithm's effectiveness in helping users understand the news.
Keywords: automatically connecting the dots; structured; efficient algorithm; personalization
Source: VideoLectures.NET
Last reviewed: 2019-12-21:lxf
Views: 36