Modelling Intra-Speaker Variability for Improved Speaker Recognition |
|
Course URL: | http://videolectures.net/slsfs05_aronowitz_misvi/ |
Lecturer: | Hagai Aronowitz |
Institution: | Bar-Ilan University |
Course date: | 2007-02-25 |
Language: | English |
Abstract: | In this paper we present a speaker recognition algorithm that explicitly models intra-speaker, inter-session variability. Such variability, which is caused by channel, noise, and temporary speaker characteristics (mood, fatigue, etc.), is not modeled explicitly by state-of-the-art speaker recognition algorithms. We define a session space in which each session (a training or test utterance) is a vector. We then calculate a rotation of the session space under which the estimated intra-speaker subspace is trivially isolated and can be modeled explicitly. Because the session space is high-dimensional, standard orthogonalization methods cannot be used, so we use QR factorization based on Givens rotations to calculate the projection. On the NIST-2004 evaluation corpus, the recognition error rate was reduced by 23% compared to the classic state-of-the-art GMM algorithm. |
Keywords: | speaker recognition algorithm; explicit modeling; session space |
Source: | VideoLectures.NET |
Last reviewed: | 2019-09-21:cwx |
Views: | 41 |
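
The abstract mentions a QR factorization based on Givens rotations that rotates the session space so that the intra-speaker subspace is isolated. The following is a minimal sketch of that general technique on a toy example, not the lecturer's implementation. The function names (givens_qr, apply_qt), the dense pure-Python representation, and the 4-dimensional data are all assumptions made for illustration; the session space in the talk is high-dimensional, so a practical version would be organized differently.

```python
import math


def givens_qr(a):
    """Factor an m x n matrix (list of rows) as a = q @ r using Givens rotations.

    q is m x m orthogonal and r is m x n upper triangular.
    """
    m, n = len(a), len(a[0])
    r = [[float(v) for v in row] for row in a]
    q = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
    for j in range(n):
        # Zero column j below the diagonal, working from the bottom row up.
        for i in range(m - 1, j, -1):
            x, y = r[i - 1][j], r[i][j]
            if y == 0.0:
                continue  # entry is already zero, no rotation needed
            h = math.hypot(x, y)
            c, s = x / h, y / h
            # Rotate rows i-1 and i of r so that r[i][j] becomes zero.
            for k in range(n):
                u, v = r[i - 1][k], r[i][k]
                r[i - 1][k] = c * u + s * v
                r[i][k] = -s * u + c * v
            # Accumulate the transposed rotation into columns i-1 and i of q.
            for k in range(m):
                u, v = q[k][i - 1], q[k][i]
                q[k][i - 1] = c * u + s * v
                q[k][i] = -s * u + c * v
    return q, r


def apply_qt(q, x):
    """Rotate vector x into the new coordinate system, i.e. compute q^T x."""
    m = len(x)
    return [sum(q[k][i] * x[k] for k in range(m)) for i in range(m)]


# Toy stand-in for an estimated subspace: the two columns of `basis`
# span a 2-dimensional subspace of a 4-dimensional space.
basis = [[1.0, 2.0],
         [3.0, 4.0],
         [5.0, 6.0],
         [7.0, 8.0]]
q, r = givens_qr(basis)

# A vector inside the subspace: 2 * (first column) - (second column).
x = [0.0, 2.0, 4.0, 6.0]
rotated = apply_qt(q, x)
print(rotated)
```

After the rotation, any vector lying in the span of the basis columns has non-zero components only in its first two coordinates, so in this toy setting the subspace is isolated in a fixed block of coordinates and the remaining coordinates can be handled separately.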