Dynamic Time Warping's New Youth |
|
Course URL: | http://ocean.nit.net.cn/subject/user/index.php?do=addmd&typeid=37 |
Lecturer: | Xavier Anguera Miro |
Institution: | ELSA (艾尔莎公司) |
Course date: | 2012-09-18 |
Course language: | English |
Course description: | Before Hidden Markov Models (HMMs) became ubiquitous in speech-related applications, pattern matching algorithms such as the well-known Dynamic Time Warping (DTW) algorithm [1] were used extensively for applications such as spoken keyword recognition [2]. At the time, the main drawbacks of this technology were its computational cost (given the hardware then available) and its lack of generalization when matching acoustic sequences from different speakers or different acoustic contexts. The availability of labeled datasets for training pushed pattern matching techniques aside in favor of HMMs. Still, HMMs have several well-known weaknesses, such as overgeneralization given the training data, lack of robustness to changing noise conditions, and the need for large corpora of well-labeled training data, which limit their suitability for some speech applications. For this reason, some research groups have recently started to look again at DTW as a plausible alternative and have worked on mitigating the issues that made it unsuitable in the past. On the one hand, new acoustic features are being researched [3] to make the matching as independent of the speaker as possible while preserving the content. On the other hand, although computing power has improved greatly since the 1970s, several enhancements to DTW have been proposed [4,5] to allow for more challenging tasks than in the past. Some of the tasks where pattern matching approaches (and DTW in particular) are currently applied are: automatic discovery of repeated patterns in speech, query-by-example voice search, pattern-based speech recognition, and low-resource language analysis. |
Keywords: | datasets; matching algorithms |
Course source: | VideoLectures.NET (视频讲座网) |
Data collected: | 2020-12-03:zyk |
Last reviewed: | 2020-12-03:zyk |
Views: | 35 |