An Ontology-Driven Probabilistic Soft Logic Approach to Improve NLP Entity Annotations
Course URL: http://videolectures.net/iswc2018_rospocher_ontology_driven_soft/
Lecturer: Marco Rospocher
Institution: Bruno Kessler Foundation
Date: 2018-11-22
Language: English
Abstract: Many approaches for Knowledge Extraction and Ontology Population rely on well-known Natural Language Processing (NLP) tasks, such as Named Entity Recognition and Classification (NERC) and Entity Linking (EL), to identify and semantically characterize the entities mentioned in natural language text. Although these tasks are intrinsically related, the analyses they perform differ, and combining their outputs may yield NLP annotations that are implausible or even conflicting in light of common world knowledge about entities. In this paper we present a Probabilistic Soft Logic (PSL) model that leverages ontological entity classes to relate NLP annotations from different tasks that refer to the same entity mentions. The intuition behind the model is that an annotation implies some ontological classes for the entity identified by the mention, and annotations from different tasks on the same mention have to share more or less the same implied entity classes. In a setting where various NLP tools return multiple confidence-weighted candidate annotations for a single mention, the model can be applied operationally to compare the different annotation combinations and possibly revise the tools' best annotation choice. We evaluated the model on the candidate annotations produced by two state-of-the-art tools for NERC and EL, on three different datasets. The results show that the joint annotation revision suggested by our PSL model consistently improves the original scores of the two tools.
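The intuition in the abstract can be illustrated with a small Python sketch. It is not the PSL model from the lecture: it replaces PSL inference with a simple product of tool confidences and a Jaccard overlap of implied classes, and the mention, candidate annotations, confidences, and class mappings are all hypothetical values chosen for illustration.

```python
from itertools import product

# Hypothetical candidate annotations for one mention, e.g. "Lincoln".
# All confidences below are made up for illustration.
nerc_candidates = {"PERSON": 0.55, "LOCATION": 0.45}
el_candidates = {"Lincoln,_Nebraska": 0.60, "Abraham_Lincoln": 0.40}

# Hypothetical mapping from each annotation to the ontology classes it implies.
implied_classes = {
    "PERSON": {"Agent", "Person"},
    "LOCATION": {"Place"},
    "Abraham_Lincoln": {"Agent", "Person", "Politician"},
    "Lincoln,_Nebraska": {"Place", "City"},
}


def jaccard(a, b):
    """Overlap of two class sets, in [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def score(nerc_type, el_entity):
    """Simplified stand-in for PSL inference: confidence product times class agreement."""
    agreement = jaccard(implied_classes[nerc_type], implied_classes[el_entity])
    return nerc_candidates[nerc_type] * el_candidates[el_entity] * agreement


def best_joint_annotation():
    """Return the (NERC type, EL entity) pair with the highest joint score."""
    pairs = product(nerc_candidates, el_candidates)
    return max(pairs, key=lambda pair: score(*pair))


if __name__ == "__main__":
    print(best_joint_annotation())
```

In this toy setting each tool's individual top choice (PERSON from NERC, Lincoln,_Nebraska from EL) implies disjoint classes, so the joint scoring prefers a combination in which the EL choice is revised to agree with the NERC type.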
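Keywords: approaches for knowledge extraction and ontology population; natural language processing; semantic characterization of natural language text; PSL model suggestions
Source: VideoLectures.NET
Data collected: 2023-01-11:cyh
Last reviewed: 2023-01-11:cyh
Views: 26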