A Novel Ensemble Method for Named Entity Recognition and Disambiguation Based on Neural Network
Course URL: http://videolectures.net/iswc2018_lisena_ensemble_method_disambig...
Lecturer: Pasquale Lisena
Institution: EURECOM
Date: 2018-11-22
Language: English
Abstract: Named entity recognition (NER) and disambiguation (NED) are subtasks of information extraction that aim, respectively, to recognize named entities mentioned in text and assign them predefined types, and to link them with their matching entities in a knowledge base. Many approaches to these tasks have been proposed in recent years, often exposed as web APIs. These APIs classify entities using different taxonomies and disambiguate them against different knowledge bases. In this paper, we describe Ensemble Nerd, a framework that collects the responses of numerous extractors, normalizes them, and combines them to produce a final entity list following the pattern (surface form, type, link). The approach represents the extractor responses as real-valued vectors and uses them as input samples for two deep learning networks: ENNTR (Ensemble Neural Network for Type Recognition) and ENND (Ensemble Neural Network for Disambiguation). We train these networks using specific gold standards. We show that the resulting models outperform every individual extractor in terms of the micro and macro F1 measures computed by the GERBIL framework.
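As an illustration of the two evaluation measures named in the abstract, the following is a minimal sketch of how micro and macro F1 differ. It assumes that annotations are compared by exact match as (surface form, type, link) tuples, that micro F1 pools counts over all documents, and that macro F1 averages per-document F1 scores. The function names and the sample data are hypothetical; this is not GERBIL's code, and GERBIL's matching rules and edge-case handling may differ.

```python
from collections import Counter


def f1(tp, fp, fn):
    """F1 from raw counts. Assumption: returns 0.0 when all counts are zero."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(docs):
    """docs: list of (gold, predicted) pairs, each a set of annotation tuples.

    Returns (micro_f1, macro_f1).
    """
    total = Counter()
    per_doc = []
    for gold, pred in docs:
        tp = len(gold & pred)   # annotations found and correct
        fp = len(pred - gold)   # annotations found but not in the gold standard
        fn = len(gold - pred)   # gold annotations that were missed
        total.update(tp=tp, fp=fp, fn=fn)
        per_doc.append(f1(tp, fp, fn))
    micro = f1(total["tp"], total["fp"], total["fn"])        # pooled counts
    macro = sum(per_doc) / len(per_doc) if per_doc else 0.0  # mean over documents
    return micro, macro


# Hypothetical data: one long document annotated perfectly, one short one missed.
docs = [
    (
        {("Paris", "Place", "kb:Paris"), ("Seine", "Place", "kb:Seine"),
         ("Louvre", "Organization", "kb:Louvre")},
        {("Paris", "Place", "kb:Paris"), ("Seine", "Place", "kb:Seine"),
         ("Louvre", "Organization", "kb:Louvre")},
    ),
    (
        {("Jordan", "Person", "kb:Michael_Jordan")},
        {("Jordan", "Place", "kb:Jordan")},
    ),
]
micro, macro = micro_macro_f1(docs)
print(f"micro F1 = {micro:.2f}, macro F1 = {macro:.2f}")
```

In this example the micro score is dominated by the document with more annotations, while the macro score weights both documents equally, which is why the two measures are usually reported together.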
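Keywords: neural-network-based named entity recognition; ensemble method for disambiguation; GERBIL framework; Ensemble Neural Network for Disambiguation (ENND)
Source: VideoLectures.NET
Data collected: 2022-12-15:cyh
Last reviewed: 2023-05-15:cyh
Views: 6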