QA4IE: A Question Answering based Framework for Information Extraction
Course URL: http://videolectures.net/iswc2018_qiu_qa4ie_question/
Lecturer: Lin Qiu
Institution: Shanghai Jiao Tong University
Date: 2018-10-22
Language: English
Course description: Information Extraction (IE) refers to automatically extracting structured relation tuples from unstructured text. Common IE solutions, including Relation Extraction (RE) and open IE systems, struggle with cross-sentence tuples and are severely restricted by limited relation types and by informal relation specifications (e.g., free-text based relation tuples). To overcome these weaknesses, we propose a novel IE framework named QA4IE, which leverages flexible question answering (QA) approaches to produce high-quality relation triples across sentences. Based on the framework, we develop a large IE benchmark with high-quality human evaluation. The benchmark contains 293K documents, 2M golden relation triples, and 636 relation types. We compare our system with several IE baselines on this benchmark, and the results show that our system achieves substantial improvements.
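Keywords: automatic extraction from unstructured text; structured relation tuples; IE solutions; flexible question answering
Source: VideoLectures.NET
Data collected: 2022-12-30: cyh
Last reviewed: 2023-05-15: cyh
Views: 26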