Connectionist Temporal Classification for End-to-End Speech Recognition
Course URL: http://videolectures.net/interACT2016_metze_temporal_classificati...
Lecturer: Florian Metze
Institution: Carnegie Mellon University
Lecture date: 2016-07-31
Language: English
Course description: The performance of automatic speech recognition (ASR) has improved tremendously due to the application of deep neural networks (DNNs). Despite this progress, building a new ASR system remains a challenging task, requiring various resources, multiple training stages and significant expertise. In this talk, I will present an approach that drastically simplifies building acoustic models for the existing weighted finite state transducer (WFST) based decoding approach, and lends itself to end-to-end speech recognition, allowing optimization for arbitrary criteria. Acoustic modeling now involves learning a single recurrent neural network (RNN), which predicts context-independent targets (e.g., syllables, phonemes or characters). The connectionist temporal classification (CTC) objective function marginalizes over all possible alignments between speech frames and label sequences, removing the need for a separate alignment of the training data. We present a generalized decoding approach based on weighted finite-state transducers (WFSTs), which enables the efficient incorporation of lexicons and language models into CTC decoding. Experiments show that this approach achieves state-of-the-art word error rates, while drastically reducing complexity and speeding up decoding when compared to standard hybrid DNN systems.
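For reference, the marginalization mentioned in the description can be written out explicitly. This is the standard CTC formulation from the literature, with notation chosen here rather than taken from the talk: let x be the T-frame input, y the target label sequence, and B the collapsing map that removes repeated symbols and then blanks from a frame-level path \pi; the RNN supplies per-frame posteriors p(\pi_t | x), and training minimizes the negative log-probability of the label sequence:

    % Standard CTC objective (notation is illustrative, not the speaker's):
    % sum over every frame-level path \pi that collapses, via B, to the target y.
    P(\mathbf{y} \mid \mathbf{x}) \;=\; \sum_{\pi \in \mathcal{B}^{-1}(\mathbf{y})} \prod_{t=1}^{T} p(\pi_t \mid \mathbf{x}),
    \qquad
    \mathcal{L}_{\mathrm{CTC}}(\mathbf{x}, \mathbf{y}) \;=\; -\log P(\mathbf{y} \mid \mathbf{x}).

The sum over the exponentially many paths in B^{-1}(y) is computed with a forward-backward recursion, which is why no separate frame-level alignment of the training data is needed.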
Keywords: automatic speech recognition; end-to-end speech recognition; acoustic modeling
Source: VideoLectures.NET (视频讲座网)
Data collected: 2021-11-26: zkj
Last reviewed: 2021-11-26: zkj
Views: 100