

Learning to Translate: statistical and computational analysis
Course URL: http://videolectures.net/smartdw09_turchi_ltt/
Lecturer: Marco Turchi
Institution: University of Bristol
Date: 2009-07-01
Language: English
Abstract: This talk presents an extensive experimental study of the statistical machine translation system Moses from the point of view of its learning capabilities. Very accurate learning curves are obtained using high-performance computing, and extrapolations of the system's projected performance under different conditions are provided. The experiments suggest that: (1) the representation power of the system is not currently a limitation on its performance; (2) the inference of its models from finite sets of i.i.d. data is responsible for current performance limitations; (3) increasing dataset sizes is unlikely to result in significant improvements (at least in the traditional i.i.d. setting); (4) novel statistical estimation methods are unlikely to result in significant improvements. The current performance wall is mostly a consequence of Zipf's law, and this should be taken into account when designing a statistical machine translation system. A few possible research directions are discussed as a result of this investigation, most notably the integration of linguistic rules into the model inference phase and the development of active learning procedures.
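Keywords: computational analysis; statistical data; statistical machine translation
Source: VideoLectures.NET
Last reviewed: 2020-05-30, 张荧 (volunteer course editor)
Views: 40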
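
The abstract attributes the performance wall to Zipf's law. The following is a minimal illustrative sketch of that intuition, not the experimental setup used in the talk: it assumes a hypothetical vocabulary whose type frequencies follow an idealized Zipf distribution, and computes the expected probability mass of types that never appear in an i.i.d. sample of a given size. The vocabulary size, the exponent, and all function names are assumptions chosen for illustration only.

```python
def zipf_probs(vocab_size, exponent=1.0):
    """Idealized Zipf distribution: p(rank) proportional to 1 / rank**exponent."""
    weights = [1.0 / (rank ** exponent) for rank in range(1, vocab_size + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def expected_unseen_mass(probs, n_tokens):
    """Expected probability mass of types never observed in n_tokens i.i.d. draws.

    A type with probability p is missing from the sample with probability
    (1 - p) ** n_tokens, so it contributes p * (1 - p) ** n_tokens.
    """
    return sum(p * (1.0 - p) ** n_tokens for p in probs)


def coverage_curve(vocab_size, sizes, exponent=1.0):
    """Return (sample size, expected unseen mass) pairs for each sample size."""
    probs = zipf_probs(vocab_size, exponent)
    return [(n, expected_unseen_mass(probs, n)) for n in sizes]


if __name__ == "__main__":
    # Hypothetical settings: 100,000 types, sample size doubled five times.
    sizes = [10_000 * 2 ** k for k in range(6)]
    for n, miss in coverage_curve(100_000, sizes):
        print(f"{n:>8} tokens: expected unseen mass = {miss:.4f}")
```

Under these assumptions the unseen mass shrinks only slowly as the sample doubles, which is one way to read the abstract's claim that larger i.i.d. datasets alone are unlikely to bring significant improvements.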