Inducing Cross-Lingual Semantic Representations of Words, Phrases, Sentences and Events
Course URL: http://videolectures.net/nipsworkshops2012_titov_semantic_represe...
Lecturer: Ivan Titov
Institution: Saarland University
Date: 2013-01-11
Language: English
Abstract: Cross-lingual representations of linguistic units (e.g., words or phrases) can facilitate transfer of annotation from resource-rich to resource-poor languages and have many potential multilingual applications (e.g., machine translation and cross-lingual information retrieval). In this talk, I will discuss our ongoing work, which aims to induce cross-lingual representations relying primarily on monolingual unannotated texts readily available for many languages. From the learning standpoint, our approaches maximize the likelihood of monolingual unannotated texts but also use a form of regularization that favors agreement on a smaller collection of parallel data (i.e., sentences along with their translations). I will address the induction of different types of cross-lingual representations (clusters and distributed representations) for different types of units (words, phrases and predicate-argument structures). We show that these models induce linguistically plausible semantic representations and that cross-lingual induction both helps to induce better representations for individual languages and benefits various cross-lingual applications. Specifically, I will consider direct transfer of a classifier for a document classification task from one language to another, and show preliminary results in the context of low-resource machine translation.
Keywords: linguistic units; cross-lingual; cross-lingual representations
Source: VideoLectures.NET
Last reviewed: 2019-09-08:lxf
Views: 118