
Semantic role frames graph-based multidocument summarization
Course URL: http://videolectures.net/sikdd2011_canhasi_multidocument/
Lecturer: Ercan Canhasi
Institution: University of Ljubljana
Date: 2011-11-04
Language: English
Abstract: Multi-document summarization is the automatic creation of a compressed version of a given collection of documents. Recently, graph-based models and ranking algorithms have been extensively researched by the extractive document summarization community. While most work to date focuses on sentence-level relations, in this paper we present a graph model that emphasizes not only sentence-level relations but also the influence of sub-sentence-level relations (e.g. similarity between parts of sentences). By using a proven cognitive psychology model (the Event-Indexing model) and semantic role parsing to generate the frame graph, we establish the basis for distinguishing the sentence-level relations. Based on this model, we developed an iterative frame and sentence ranking algorithm built on the well-known PageRank algorithm. Experiments are conducted on the DUC 2004 data sets, and the ROUGE (Recall-Oriented Understudy for Gisting Evaluation) evaluation results demonstrate the advantages of the proposed approach.
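The abstract mentions ranking sentences with a PageRank-style algorithm over a graph. As a rough illustration of that general idea only, the sketch below runs plain PageRank over a sentence graph whose edge weights are bag-of-words cosine similarities. It is not the lecture's frame-and-sentence algorithm: it uses no semantic role parsing and no frame graph, and every name in it (`tokenize`, `cosine`, `build_graph`, `pagerank`, `summarize`, and the damping value 0.85) is an assumption made for this example.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_graph(sentences):
    """Symmetric weight matrix: similarity between every pair of sentences."""
    bags = [Counter(tokenize(s)) for s in sentences]
    n = len(bags)
    return [[cosine(bags[i], bags[j]) if i != j else 0.0 for j in range(n)]
            for i in range(n)]


def pagerank(weights, damping=0.85, tol=1e-8, max_iter=200):
    """Weighted PageRank by power iteration; scores sum to 1."""
    n = len(weights)
    if n == 0:
        return []
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(max_iter):
        # Nodes with no outgoing weight spread their score uniformly.
        dangling = sum(scores[i] for i in range(n) if out_sum[i] == 0.0)
        new_scores = []
        for j in range(n):
            incoming = sum(scores[i] * weights[i][j] / out_sum[i]
                           for i in range(n) if out_sum[i] > 0.0)
            new_scores.append((1.0 - damping) / n
                              + damping * (incoming + dangling / n))
        delta = sum(abs(x - y) for x, y in zip(new_scores, scores))
        scores = new_scores
        if delta < tol:
            break
    return scores


def summarize(sentences, k=2):
    """Return the k highest-ranked sentences in their original order."""
    scores = pagerank(build_graph(sentences))
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

Calling `summarize(sentences, k)` on a list of sentence strings returns an extractive summary of `k` sentences. A sentence that shares no words with any other receives only the baseline score, so it is ranked last; the approach described in the abstract would additionally rank semantic role frames and let frame scores and sentence scores influence each other.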
Keywords: multi-document summarization; compressed version; ranking algorithm
Source: VideoLectures.NET
Data collected: 2022-12-19:chenjy
Last reviewed: 2023-05-11:chenjy
Views: 43