Zhishi.me - Weaving Chinese Linking Open Data
Course URL: http://videolectures.net/iswc2011_niu_zhishi/  
Lecturer: Xing Niu
Institution: Shanghai Jiao Tong University
Date: 2011-11-25
Language: Simplified Chinese
Course description: Linking Open Data (LOD) has become one of the most important community efforts to publish high-quality interconnected semantic data. Such data has been widely used in many applications to provide intelligent services such as entity search and personalized recommendation. While DBpedia, one of the LOD core data sources, contains resources described in multiple language versions and semantic data in English is proliferating, there has been very little work on publishing Chinese semantic data. In this paper, we present Zhishi.me, the first effort to publish large-scale Chinese semantic data and link them together as a Chinese LOD (CLOD). More precisely, we identify important structural features in the three largest Chinese encyclopedia sites (i.e., Baidu Baike, Hudong Baike, and Chinese Wikipedia) for extraction and propose several data-level mapping strategies for automatic link discovery. As a result, the CLOD has more than 5 million distinct entities, and we simply link CLOD with the existing LOD based on the multilingual characteristic of Wikipedia. Finally, we also introduce three Web access entry points, namely a SPARQL endpoint, a lookup interface, and a detailed data view, which conform to the principles of publishing data sources to LOD.
Keywords: open data; semantic data; mapping strategies
Source: VideoLectures.NET
Last reviewed: 2019-05-05:lxf
Views: 98