Harvesting, Searching, and Ranking Knowledge from the Web
Course URL: http://videolectures.net/wsdm09_weikum_hsrnw/
Lecturer: Gerhard Weikum
Institution: Max Planck Institute
Date: 2009-03-12
Language: English
Course description: There are major trends to advance the functionality of search engines to a more expressive semantic level. This is enabled by employing large-scale information extraction of entities and relationships from semistructured as well as natural-language Web sources. In addition, harnessing Semantic-Web-style ontologies and reaching into Deep-Web sources can contribute towards a grand vision of turning the Web into a comprehensive knowledge base that can be efficiently searched with high precision. This talk presents ongoing research towards this objective, with emphasis on our work on the YAGO knowledge base and the NAGA search engine but also covering related projects. YAGO is a large collection of entities and relational facts that are harvested from Wikipedia and WordNet with high accuracy and reconciled into a consistent RDF-style "semantic" graph. For further growing YAGO from Web sources while retaining its high quality, pattern-based extraction is combined with logic-based consistency checking in a unified framework. NAGA provides graph-template-based search over this data, with powerful ranking capabilities based on a statistical language model for graphs. Advanced queries and the need for ranking approximate matches pose efficiency and scalability challenges that are addressed by algorithmic and indexing techniques. YAGO is publicly available and has been imported into various other knowledge-management projects including DBpedia. YAGO shares many of its goals and methodologies with parallel projects along related lines. These include Avatar, Cimple/DBlife, DBpedia, KnowItAll/TextRunner, Kylin/KOG, and the Libra technology (and more). Together they form an exciting trend towards providing comprehensive knowledge bases with semantic search capabilities.
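As a rough illustration of what "graph-template-based search" over an RDF-style fact graph can look like, the sketch below stores facts as subject-relation-object triples and matches a small template containing variables. It is a minimal toy under stated assumptions, not the NAGA implementation: the relation names (`bornIn`, `locatedIn`, `hasWonPrize`), the `match` function, and the sample facts are all hypothetical, and the ranking by a statistical language model mentioned in the description is omitted entirely.

```python
from typing import Dict, Iterator, List, Optional, Tuple

Triple = Tuple[str, str, str]

# Toy facts written for this example; relation names are illustrative
# and are not claimed to match the actual YAGO schema.
FACTS: List[Triple] = [
    ("Max_Planck", "bornIn", "Kiel"),
    ("Max_Planck", "hasWonPrize", "Nobel_Prize_in_Physics"),
    ("Albert_Einstein", "bornIn", "Ulm"),
    ("Albert_Einstein", "hasWonPrize", "Nobel_Prize_in_Physics"),
    ("Kiel", "locatedIn", "Germany"),
    ("Ulm", "locatedIn", "Germany"),
]


def match(
    template: List[Triple],
    facts: List[Triple],
    binding: Optional[Dict[str, str]] = None,
) -> Iterator[Dict[str, str]]:
    """Yield every variable binding under which all template triples are facts.

    Terms starting with "?" are variables; all other terms must match exactly.
    This is plain backtracking with no indexing and no ranking.
    """
    binding = binding or {}
    if not template:
        yield dict(binding)
        return
    first, rest = template[0], template[1:]
    for fact in facts:
        candidate = dict(binding)
        consistent = True
        for term, value in zip(first, fact):
            if term.startswith("?"):
                # Bind the variable, or check it against an earlier binding.
                if candidate.setdefault(term, value) != value:
                    consistent = False
                    break
            elif term != value:
                consistent = False
                break
        if consistent:
            yield from match(rest, facts, candidate)


# Template: "who was born in a German city and won the Nobel Prize in Physics?"
QUERY: List[Triple] = [
    ("?x", "bornIn", "?c"),
    ("?c", "locatedIn", "Germany"),
    ("?x", "hasWonPrize", "Nobel_Prize_in_Physics"),
]

if __name__ == "__main__":
    for result in match(QUERY, FACTS):
        print(result)
```

A real system would replace the linear scan over `FACTS` with indexes on subjects, relations, and objects, and would score approximate matches rather than returning only exact ones; those are the efficiency and ranking issues the talk description refers to.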
Keywords: Web; computer science; Web mining
Source: VideoLectures.NET
Last reviewed: 2020-07-29 (yumf)
Views: 27