

Modelling Intra-Speaker Variability for Improved Speaker Recognition
Course URL: http://videolectures.net/slsfs05_aronowitz_misvi/
Lecturer: Hagai Aronowitz
Institution: Bar-Ilan University
Course date: 2007-02-25
Course language: English
Abstract: In this paper we present a speaker recognition algorithm that explicitly models intra-speaker, inter-session variability. Such variability, which is caused by channel, noise and temporary speaker characteristics (mood, fatigue, etc.), is not modeled explicitly by state-of-the-art speaker recognition algorithms. We define a session space in which each session (a spoken utterance used for either training or testing) is a vector. We then calculate a rotation of the session space under which the estimated intra-speaker subspace is trivially isolated and can be modeled explicitly. Because of the high dimensionality of the session space, standard orthogonalization methods cannot be used. We therefore use QR factorization based on Givens rotations to calculate the projection. On the NIST-2004 evaluation corpus, the recognition error rate was reduced by 23% compared with the classic GMM state-of-the-art algorithm.
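The abstract names QR factorization based on Givens rotations but does not show how it is applied to the session space. As a generic illustration only, the following minimal sketch factors a small dense matrix with the textbook Givens scheme. The function name `givens_qr` and the list-of-lists matrix layout are assumptions made for this example, and it makes no attempt to reproduce the high-dimensional procedure used in the talk.

```python
import math


def givens_qr(a):
    """Factor a (m x n, list of rows) as a = q * r using Givens rotations.

    Illustrative sketch: q is m x m orthogonal, r is m x n upper triangular.
    """
    m, n = len(a), len(a[0])
    r = [[float(v) for v in row] for row in a]
    q = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]

    for j in range(n):
        # Work upward from the bottom row, zeroing entries below the diagonal.
        for i in range(m - 1, j, -1):
            x, y = r[i - 1][j], r[i][j]
            if y == 0.0:
                continue  # entry is already zero, no rotation needed
            h = math.hypot(x, y)
            c, s = x / h, y / h

            # Rotate rows i-1 and i of r so that r[i][j] becomes zero.
            for k in range(n):
                u, v = r[i - 1][k], r[i][k]
                r[i - 1][k] = c * u + s * v
                r[i][k] = -s * u + c * v
            r[i][j] = 0.0  # remove rounding residue

            # Accumulate the transposed rotation into columns i-1 and i of q.
            for k in range(m):
                u, v = q[k][i - 1], q[k][i]
                q[k][i - 1] = c * u + s * v
                q[k][i] = -s * u + c * v
    return q, r


if __name__ == "__main__":
    q, r = givens_qr([[4.0, 1.0], [3.0, 2.0], [0.0, 5.0]])
    for row in r:
        print(row)
```

Each rotation touches only two rows at a time, which is one reason Givens-based schemes are often considered when a matrix is too large to orthogonalize in a single dense step.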
Keywords: speaker recognition algorithm; explicit modeling; session space
Course source: VideoLectures.NET
Last reviewed: 2019-09-21:cwx
Views: 41