Event Detection in Twitter |
|
Course URL: | http://videolectures.net/icwsm2011_lee_detection/ |
Lecturer: | Francis Lee |
Institution: | HP Labs |
Date: | 2011-08-11 |
Language: | English |
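Summary (translated from Chinese): | Twitter, as a form of social media, has risen rapidly in recent years, and users now use it to report real-life events. This paper focuses on detecting those events by analyzing the text stream in Twitter. Although event detection has long been a research topic, the characteristics of Twitter make it a non-trivial task. Tweets reporting such events are usually drowned out by a flood of meaningless "babble". In addition, given the sheer volume of tweets, an event detection algorithm needs to be scalable. This paper attempts to address these challenges with EDCoW (Event Detection with Clustering of Wavelet-based Signals). EDCoW builds a signal for each individual word by applying wavelet analysis to the word's frequency-based raw signal. It then filters out trivial words by examining the autocorrelation of their signals. The remaining words are clustered into events with a modularity-based graph partitioning technique. Experimental studies show promising results for EDCoW. We also present the design of a proof-of-concept system used to analyze netizens' online discussion of the Singapore General Election 2011. |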
Course description: | Twitter, as a form of social media, has been emerging rapidly in recent years. Users are using Twitter to report real-life events. This paper focuses on detecting those events by analyzing the text stream in Twitter. Although event detection has long been a research topic, the characteristics of Twitter make it a non-trivial task. Tweets reporting such events are usually overwhelmed by a flood of meaningless "babble". Moreover, an event detection algorithm needs to be scalable given the sheer number of tweets. This paper attempts to tackle these challenges with EDCoW (Event Detection with Clustering of Wavelet-based Signals). EDCoW builds signals for individual words by applying wavelet analysis to the frequency-based raw signals of the words. It then filters out the trivial words by looking at the autocorrelation of their signals. The remaining words are then clustered to form events with a modularity-based graph partitioning technique. Experimental studies show promising results for EDCoW. We also present the design of a proof-of-concept system, which was used to analyze netizens' online discussion about the Singapore General Election 2011. |
Keywords: | Twitter; detection algorithms |
Source: | VideoLectures.NET |
Data collected: | 2021-04-28:zyk |
Last reviewed: | 2021-04-28:zyk |
Views: | 78 |
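
The autocorrelation filtering step named in the course description can be illustrated with a minimal Python sketch. It is not the authors' implementation. The wavelet stage is skipped entirely and raw per-interval word counts stand in for the wavelet-based signals. The lag of 1, the fixed threshold of 0.3, the choice to keep high-autocorrelation words, and the names `autocorrelation` and `filter_trivial_words` are all assumptions made for illustration. The sample counts are invented.

```python
from statistics import mean


def autocorrelation(signal, lag=1):
    """Sample autocorrelation of a numeric sequence at the given lag."""
    n = len(signal)
    if n <= lag:
        return 0.0
    mu = mean(signal)
    denom = sum((x - mu) ** 2 for x in signal)
    if denom == 0:
        return 0.0  # constant signal: treated as trivial in this sketch
    num = sum((signal[t] - mu) * (signal[t + lag] - mu) for t in range(n - lag))
    return num / denom


def filter_trivial_words(signals, threshold=0.3):
    """Keep words whose lag-1 autocorrelation exceeds the threshold.

    Both the threshold value and the "keep if above" rule are assumptions.
    """
    return {w: s for w, s in signals.items() if autocorrelation(s) > threshold}


# Invented per-interval counts for two hypothetical words.
signals = {
    "election": [1, 2, 4, 8, 9, 7, 3, 1],  # one smooth burst
    "lol": [5, 1, 6, 0, 5, 1, 6, 0],       # erratic chatter
}
kept = filter_trivial_words(signals)
print(sorted(kept))
```

With these invented counts, only the smoothly bursting word passes the filter. In the method as described, the surviving words would then be clustered into events by modularity-based graph partitioning, which this sketch does not attempt.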