Factoring Speech into Linguistic Features
Course URL: http://videolectures.net/clsp_livescu_linguistic/
Lecturer: Karen Livescu
Institution: Toyota Technological Institute at Chicago
Date: 2012-02-15
Language: English
Course description: Spoken language technologies, such as automatic speech recognition and synthesis, typically treat speech as a string of "phones". In contrast, humans produce speech through a complex combination of semi-independent articulatory trajectories. Recent theories of phonology acknowledge this, and treat speech as a combination of multiple streams of linguistic "features". In this talk I will present ways in which the factorization of speech into features can be useful in speech recognition, in both audio and visual (lipreading) settings. The main contribution is a feature-based approach to pronunciation modeling, using dynamic Bayesian networks. In this class of models, the great variety of pronunciations seen in conversational speech is explained as the result of asynchrony among feature streams and changes in individual feature values. I will also discuss the use of linguistic features in observation modeling via feature-specific classifiers. I will describe the application of these ideas in experiments with audio and visual speech recognition, and present analyses suggesting additional potential applications in speech science and technology.
Keywords: speech recognition; pronunciation modeling; dynamic Bayesian networks
Source: VideoLectures.NET
Last reviewed: 2020-09-25 (yumf)
Views: 60