
Knowledge Vault: A Web-Scale Approach to Probabilistic Knowledge Fusion
Course URL: http://videolectures.net/kdd2014_murphy_knowledge_vault/
Lecturer: Kevin P. Murphy
Institution: Google, Inc.
Date: 2014-10-07
Language: English

Course description: Recent years have witnessed a proliferation of large-scale knowledge bases, including Wikipedia, Freebase, YAGO, Microsoft's Satori, and Google's Knowledge Graph. To increase the scale even further, we need to explore automatic methods for constructing knowledge bases. Previous approaches have primarily focused on text-based extraction, which can be very noisy. Here we introduce Knowledge Vault, a Web-scale probabilistic knowledge base that combines extractions from Web content (obtained via analysis of text, tabular data, page structure, and human annotations) with prior knowledge derived from existing knowledge repositories. We employ supervised machine learning methods for fusing these distinct information sources. The Knowledge Vault is substantially bigger than any previously published structured knowledge repository, and features a probabilistic inference system that computes calibrated probabilities of fact correctness. We report the results of multiple studies that explore the relative utility of the different information sources and extraction methods.
Keywords: knowledge base; probabilistic inference; information extraction
Source: VideoLectures.NET
Data collected: 2020-11-04:zyk
Last reviewed: 2020-11-04:zyk