Preserving Metadata from Parliamentary Debates (MOOC, overseas open course)

Course URL:	http://videolectures.net/parlaCLARIN2018_karakanta_parliamentary_...
Lecturer:	Alina Karakanta
Institution:	Saarland University
Course date:	2018-05-30
Course language:	English
Course description:	Multilingual parliaments have been a useful source for monolingual and multilingual corpus collection. However, extra-textual information about speakers or the original language of the sentences is often absent, and as a result these resources cannot be fully used in translation studies. In this paper we present a method for processing and building a parallel corpus consisting of parliamentary debates of the European Parliament for English into German and English into Spanish. The paper documents all necessary (pre- and post-) processing steps for creating such a valuable resource. In addition to the parallel corpora, we collect monolingual comparable corpora for English, German and Spanish using the same method.
Keywords:	language; corpus; metadata
Course source:	VideoLectures.NET
Data collected:	2020-11-02: yxd
Last reviewed:	2020-11-03: zyk
Views:	182
