Hierarchical Multi-Stream Posterior Based Speech Recognition System |
|
Course URL: | http://videolectures.net/mlmi04uk_ketabdar_hmspb/ |
Lecturer: | Hamed Ketabdar |
Institution: | Swiss Federal Institute of Technology |
Date: | 2007-02-25 |
Language: | English |
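Summary: | This paper presents initial results toward improving posterior-based speech recognition systems. More informative posteriors are estimated from multiple feature streams, taking into account acoustic context (for example, the context available across the whole utterance) and possible prior information (such as topological constraints). The posteriors are estimated using the "state gamma posterior" definition (commonly used in standard HMM training), extended to multi-stream HMMs. The approach offers a new, principled theoretical framework for hierarchical estimation and use of posteriors, for multi-stream feature combination, and for integrating appropriate context and prior knowledge into posterior estimation. In the present work, the resulting gamma posteriors serve as features for a standard HMM/GMM layer. On the OGI Digits database and on a reduced-vocabulary (1000-word) version of the DARPA Conversational Telephone Speech-to-text (CTS) task, this yields a significant performance improvement over state-of-the-art Tandem systems. |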
Abstract: | In this paper, we present initial results towards boosting posterior based speech recognition systems by estimating more informative posteriors using multiple streams of features and taking into account acoustic context (e.g., as available in the whole utterance), as well as possible prior information (such as topological constraints). These posteriors are estimated based on the "state gamma posterior" definition (typically used in standard HMM training) extended to the case of multi-stream HMMs. This approach provides a new, principled, theoretical framework for hierarchical estimation/use of posteriors, multi-stream feature combination, and integrating appropriate context and prior knowledge in posterior estimates. In the present work, we used the resulting gamma posteriors as features for a standard HMM/GMM layer. On the OGI Digits database and on a reduced vocabulary version (1000 words) of the DARPA Conversational Telephone Speech-to-text (CTS) task, this resulted in significant performance improvement, compared to the state-of-the-art Tandem systems. |
Keywords: | acoustics; speech recognition system; performance improvement |
Source: | VideoLectures.NET |
Last reviewed: | 2020-06-15:heyf |
Views: | 42 |
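
The abstract's central quantity, the state gamma posterior, is the probability of being in a given HMM state at a given frame given the whole utterance. The sketch below is one minimal way to compute it with a scaled forward-backward pass. It is not taken from the lecture: the function name `state_gamma_posteriors`, the optional `stream_weights`, and the choice to combine streams by multiplying per-stream state likelihoods (that is, assuming the streams are conditionally independent given the state) are all illustrative assumptions. Prior knowledge such as a topological constraint enters through zeros in the transition matrix.

```python
def state_gamma_posteriors(stream_likelihoods, transitions, initial,
                           stream_weights=None):
    """Return gamma[t][k] = P(state k at frame t | all frames, all streams).

    stream_likelihoods[s][t][k] is the likelihood of stream s's frame t
    under state k (hypothetical input layout chosen for this sketch).
    """
    num_streams = len(stream_likelihoods)
    num_frames = len(stream_likelihoods[0])
    num_states = len(initial)
    if stream_weights is None:
        stream_weights = [1.0] * num_streams

    # Assumption: streams are independent given the state, so the joint
    # emission score is a (weighted) product over streams.
    emit = [[1.0] * num_states for _ in range(num_frames)]
    for s in range(num_streams):
        for t in range(num_frames):
            for k in range(num_states):
                emit[t][k] *= stream_likelihoods[s][t][k] ** stream_weights[s]

    # Forward pass, rescaled at every frame to avoid underflow.
    alpha = []
    for t in range(num_frames):
        if t == 0:
            row = [initial[k] * emit[0][k] for k in range(num_states)]
        else:
            prev = alpha[-1]
            row = [emit[t][k] * sum(prev[j] * transitions[j][k]
                                    for j in range(num_states))
                   for k in range(num_states)]
        norm = sum(row)  # zero only if the topology forbids every path
        alpha.append([v / norm for v in row])

    # Backward pass, also rescaled at every frame.
    beta = [[1.0] * num_states for _ in range(num_frames)]
    for t in range(num_frames - 2, -1, -1):
        row = [sum(transitions[k][j] * emit[t + 1][j] * beta[t + 1][j]
                   for j in range(num_states))
               for k in range(num_states)]
        norm = sum(row)
        beta[t] = [v / norm for v in row]

    # Gamma is proportional to alpha * beta; per-frame normalization
    # cancels the arbitrary scaling factors used above.
    gammas = []
    for t in range(num_frames):
        row = [alpha[t][k] * beta[t][k] for k in range(num_states)]
        norm = sum(row)
        gammas.append([v / norm for v in row])
    return gammas


# Toy example: two streams, three frames, two states, left-to-right topology.
stream_a = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]]
stream_b = [[0.7, 0.3], [0.5, 0.5], [0.1, 0.9]]
left_to_right = [[0.5, 0.5], [0.0, 1.0]]  # state 1 cannot return to state 0
for frame_posteriors in state_gamma_posteriors(
        [stream_a, stream_b], left_to_right, [1.0, 0.0]):
    print(frame_posteriors)
```

Each returned row sums to one, so the rows can be used directly as per-frame feature vectors, which is the role the abstract gives gamma posteriors in front of the HMM/GMM layer. Any further feature processing the original system applies is not covered here.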