Unsupervised Estimation for Noisy-Channel Models |
|
Course URL: | http://videolectures.net/icml07_mylonakis_uefn/ |
Lecturer: | Markos Mylonakis |
Institution: | University of Amsterdam |
Date: | 2007-07-23 |
Language: | English |
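Summary (translated from Chinese): | Shannon’s noisy-channel model describes how to reconstruct a corrupted message and underlies much work in statistical language and speech processing. It factors into a language model for the original message and a channel model for the channel’s corruption process. Channel-model parameters are usually estimated by unsupervised maximum likelihood on the observed data, typically approximated with the Expectation-Maximization (EM) algorithm. The paper shows that maximizing the joint likelihood of the data at both ends of the noisy channel works better. It derives a corresponding bi-directional EM algorithm and shows that it outperforms standard EM on two tasks: (1) translation with a probabilistic lexicon and (2) adapting a part-of-speech tagger between related languages. |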
Course description: | Shannon’s noisy-channel model, which describes how a corrupted message might be reconstructed, has been the cornerstone of much work in statistical language and speech processing. The model factors into two components: a language model that characterizes the original message and a channel model that describes the channel’s corruptive process. The standard approach to estimating the channel model’s parameters is unsupervised maximum-likelihood estimation on the observed data, usually approximated with the Expectation-Maximization (EM) algorithm. In this paper we show that it is better to maximize the joint likelihood of the data at both ends of the noisy channel. We derive a corresponding bi-directional EM algorithm and show that it outperforms standard EM on two tasks: (1) translation using a probabilistic lexicon and (2) adaptation of a part-of-speech tagger between related languages. |
Keywords: | channel model; statistical language processing; speech processing; parameter estimation |
Source: | VideoLectures.NET |
Last reviewed: | 2019-12-06:lxf |
Views: | 32 |
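
The course description says the noisy-channel model factors into a language model and a channel model. As a rough illustration of that factorization only, and not of the bi-directional EM algorithm from the talk, the sketch below picks the source message x that maximizes P(x) * P(y | x) for an observed message y. The names LANGUAGE_MODEL, CHANNEL_MODEL and decode, and all of the probabilities, are hypothetical and invented for this example.

```python
import math

# Hypothetical toy probabilities, invented purely for illustration.
# Language model: P(x) over candidate source messages x.
LANGUAGE_MODEL = {"the house": 0.6, "house the": 0.1, "the home": 0.3}

# Channel model: P(y | x) over observed messages y for each source x.
CHANNEL_MODEL = {
    "the house": {"das Haus": 0.7, "das Heim": 0.3},
    "house the": {"das Haus": 0.7, "das Heim": 0.3},
    "the home": {"das Haus": 0.3, "das Heim": 0.7},
}


def decode(observation):
    """Return the source x maximizing log P(x) + log P(observation | x).

    Returns None when no candidate source gives the observation
    a non-zero probability.
    """
    best, best_score = None, -math.inf
    for source, prior in LANGUAGE_MODEL.items():
        likelihood = CHANNEL_MODEL[source].get(observation, 0.0)
        if prior <= 0.0 or likelihood <= 0.0:
            continue  # skip impossible candidates to avoid log(0)
        score = math.log(prior) + math.log(likelihood)
        if score > best_score:
            best, best_score = source, score
    return best


if __name__ == "__main__":
    for y in ("das Haus", "das Heim"):
        print(y, "->", decode(y))
```

In this toy setup the channel table is fixed by hand. The estimation problem the talk addresses is how to learn such a table from data without supervision, which this sketch does not attempt.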