SABINE: A Multi-Purpose Dataset of Semantically-Annotated Social Content
Course URL: http://videolectures.net/iswc2018_ferrara_sabine_multi_purpose/  
Instructor: Alfio Ferrara
Institution: University of Milan
Date: 2018-11-22
Language: English
Course description: Social Business Intelligence (SBI) is the discipline that combines corporate data with social content to let decision makers analyze the trends perceived from the environment. SBI poses research challenges in several areas, such as IR, data mining, and NLP; unfortunately, SBI research is often held back by the lack of publicly available, real-world data for experimenting with approaches, and by the difficulty of determining a ground truth. To fill this gap we present SABINE, a modular dataset in the domain of European politics. SABINE includes 6 million bilingual clips crawled from 50,000 web sources, each associated with metadata and sentiment scores; an ontology with 400 topics, their occurrences in the clips, and their mapping to DBpedia; and two multidimensional cubes for analyzing and aggregating sentiment and semantic occurrences. We also propose a set of research challenges that can be addressed using SABINE; notably, the presence of an expert-validated ground truth makes it possible to test approaches to the whole SBI process as well as to each single task.
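Keywords: social business intelligence; analyzing and aggregating sentiment and semantic occurrences; data mining and NLP
Source: VideoLectures.NET
Data collected: 2022-12-20:cyh
Last reviewed: 2022-12-23:cyh
Views: 12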