SOFIE: Self-Organizing Flexible Information Extraction
Course URL: http://videolectures.net/www09_suchanek_sofie/
Lecturers: Mauro Sozio, Fabian M. Suchanek, Gerhard Weikum
Institution: Max Planck Institute
Date: 2009-05-20
Language: English
Course description: This paper presents SOFIE, a system that can extend an existing ontology with new facts. SOFIE provides an integrative framework in which information extraction, word disambiguation, and semantic reasoning all become part of one unifying model. SOFIE processes text or Web sources and finds meaningful patterns. It maps the words in each pattern to entities in the ontology, hypothesizes on the meaning of the pattern, and checks the semantic plausibility of the hypothesis against the existing ontology. The new fact is then added to the ontology, avoiding inconsistency with the existing facts. The logical model that connects existing facts, new hypotheses, extraction patterns, and consistency constraints is represented as a set of propositional clauses. We use an approximation algorithm for the Weighted MAX SAT problem to compute the most plausible subset of hypotheses. The SOFIE framework thereby integrates the paradigms of pattern matching, entity disambiguation, and ontological reasoning into one unified model, and enables the automated growth of large ontologies. Experiments using the YAGO ontology as existing knowledge and various text and Web corpora as input sources show that our method yields very good precision, around 90 percent or higher.
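The idea of choosing the most plausible hypotheses through Weighted MAX SAT can be illustrated with a toy sketch. It is not SOFIE's algorithm: the abstract says SOFIE uses an approximation algorithm, whereas this sketch searches every assignment exhaustively, which only works for a handful of variables. The fact names, the weights, and the functions `satisfied_weight` and `weighted_max_sat` are all assumptions made up for illustration.

```python
from itertools import product

# A clause is (weight, [(variable, wanted_truth_value), ...]).
# It counts as satisfied when at least one literal matches the assignment.
def satisfied_weight(clauses, assignment):
    return sum(w for w, lits in clauses
               if any(assignment[v] == val for v, val in lits))

def weighted_max_sat(variables, clauses, fixed=None):
    """Exhaustive search (toy only): return the assignment with the
    highest total weight of satisfied clauses, keeping `fixed` values."""
    fixed = fixed or {}
    free = [v for v in variables if v not in fixed]
    best, best_w = None, -1
    for values in product([True, False], repeat=len(free)):
        assignment = {**fixed, **dict(zip(free, values))}
        w = satisfied_weight(clauses, assignment)
        if w > best_w:
            best, best_w = assignment, w
    return best, best_w

# Hypothetical example: one fact already in the ontology, two new hypotheses.
known = "bornIn(Einstein, Ulm)"
h1 = "bornIn(Einstein, Princeton)"
h2 = "diedIn(Einstein, Princeton)"

clauses = [
    (10, [(known, False), (h1, False)]),  # consistency: only one birthplace
    (2, [(h1, True)]),                    # weak textual evidence for h1
    (3, [(h2, True)]),                    # textual evidence for h2
]

# The existing fact is held true; only the hypotheses are searched.
best, weight = weighted_max_sat([known, h1, h2], clauses, fixed={known: True})
print(best, weight)
```

In this toy setup the heavily weighted consistency clause makes it cheaper to reject the conflicting birthplace hypothesis than to accept it, while the unrelated hypothesis is kept.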
Keywords: integrative framework; unified model; word mapping
Source: VideoLectures.NET
Last reviewed: 2020-06-06 by Zhang Zeping (章泽平), volunteer course editor
Views: 62