
Preserving Metadata from Parliamentary Debates
Course URL: http://videolectures.net/parlaCLARIN2018_karakanta_parliamentary_...
Lecturer: Alina Karakanta
Institution: Saarland University
Date: 2018-05-30
Language: English


Course description: Multilingual parliaments have been a useful source for collecting monolingual and multilingual corpora. However, extra-textual information about the speakers or the original language of the sentences is often absent, so these resources cannot be fully used in translation studies. In this paper, we present a method for processing and building a parallel corpus of European Parliament debates for English into German and English into Spanish. The paper documents all necessary pre- and post-processing steps for creating such a resource. In addition to the parallel corpora, we collect monolingual comparable corpora for English, German and Spanish using the same method.
Keywords: multilingual parliaments; multilingual corpora; monolingual comparable corpora
Source: VideoLectures.NET
Data collected: 2020-11-26:cjy
Last reviewed: 2020-11-26:cjy