Learning Similarity Metrics for Event Identification in Social Media
Course URL: http://videolectures.net/wsdm2010_becker_lsmf/
Lecturer: Hila Becker
Institution: Columbia University
Date: 2010-10-12
Language: English
Course description: Social media sites (e.g., Flickr, YouTube, and Facebook) are a popular distribution outlet for users looking to share their experiences and interests on the Web. These sites host substantial amounts of user-contributed materials (e.g., photographs, videos, and textual content) for a wide variety of real-world events of different types and scales. By automatically identifying these events and their associated user-contributed social media documents, which is the focus of this paper, we can enable event browsing and search in state-of-the-art search engines. To address this problem, we exploit the rich “context” associated with social media content, including user-provided annotations (e.g., title, tags) and automatically generated information (e.g., content creation time). Using this rich context, which includes both textual and non-textual features, we can define appropriate document similarity metrics to enable online clustering of media to events. As a key contribution of this paper, we explore a variety of techniques for learning multi-feature similarity metrics for social media documents in a principled manner. We evaluate our techniques on large-scale, real-world datasets of event images from Flickr. Our evaluation results suggest that our approach identifies events, and their associated social media documents, more effectively than the state-of-the-art strategies on which we build.
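
The description mentions combining textual and non-textual features into a document similarity metric and using it for online clustering of media into events. The following is only a minimal sketch of that general idea, not the method presented in the talk. The `Doc` and `Cluster` types, the Jaccard and time-decay similarities, the hand-set weights, and the threshold are all assumptions made for illustration; the talk concerns learning how to combine features, whereas this sketch fixes the weights by hand.

```python
from dataclasses import dataclass
from math import exp


@dataclass
class Doc:
    tags: frozenset   # user-provided annotations (hypothetical representation)
    created: float    # content creation time in hours (hypothetical unit)


@dataclass
class Cluster:
    tags: set
    mean_time: float
    size: int = 1

    def add(self, doc):
        # Update the cluster summary incrementally as documents arrive.
        self.tags |= doc.tags
        self.mean_time = (self.mean_time * self.size + doc.created) / (self.size + 1)
        self.size += 1


def tag_similarity(a, b):
    # Jaccard overlap between two tag sets.
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def time_similarity(t1, t2, scale=24.0):
    # Decays from 1 toward 0 as the time gap grows; scale is an assumed constant.
    return exp(-abs(t1 - t2) / scale)


def combined_similarity(doc, cluster, w_tags=0.6, w_time=0.4):
    # Weighted sum of per-feature similarities; weights are fixed by hand here.
    return (w_tags * tag_similarity(doc.tags, cluster.tags)
            + w_time * time_similarity(doc.created, cluster.mean_time))


def cluster_online(docs, threshold=0.5):
    # Single pass: join the most similar cluster if it clears the threshold,
    # otherwise start a new cluster. Returns one cluster index per document.
    clusters, labels = [], []
    for doc in docs:
        scores = [combined_similarity(doc, c) for c in clusters]
        best = max(range(len(scores)), key=scores.__getitem__, default=None)
        if best is not None and scores[best] >= threshold:
            clusters[best].add(doc)
            labels.append(best)
        else:
            clusters.append(Cluster(set(doc.tags), doc.created))
            labels.append(len(clusters) - 1)
    return labels
```

For example, two photos tagged with overlapping concert tags and taken two hours apart would land in the same cluster under these assumed settings, while a photo with unrelated tags taken weeks later would start a new one.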
Keywords: social media sites; materials; automatic identification; event browsing; search engines
Source: VideoLectures.NET
Last reviewed: 2020-11-13:yumf
Views: 76