Modelling Intra-Speaker Variability for Improved Speaker Recognition

Course URL:	http://videolectures.net/slsfs05_aronowitz_misvi/
Lecturer:	Hagai Aronowitz
Institution:	Bar-Ilan University
Course date:	2007-02-25
Language:	English
Course description:	In this paper we present a speaker recognition algorithm that explicitly models intra-speaker, inter-session variability. Such variability, which is caused by channel, noise, and temporary speaker characteristics (mood, fatigue, etc.), is not modeled explicitly by state-of-the-art speaker recognition algorithms. We define a session space in which each session (either a training or a test utterance) is a vector. We then calculate a rotation of the session space for which the estimated intra-speaker subspace is trivially isolated and can be modeled explicitly. Due to the high dimensionality of the session space, it is impossible to use standard orthogonalization methods. We therefore use QR factorization based on Givens rotations to calculate the projection. On the NIST-2004 evaluation corpus, the recognition error rate was reduced by 23% compared to the classic GMM state-of-the-art algorithm.
Keywords:	speaker recognition algorithm; explicit modeling; session space
Source:	VideoLectures.NET
Last reviewed:	2019-09-21 (cwx)
Views:	57
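
The course description names QR factorization based on Givens rotations as the tool used to compute the projection. As a rough illustration of that technique only, the following is a minimal textbook sketch in pure Python. It is not the lecturer's implementation: the function name givens_qr and the small dense example matrix are assumptions made for illustration, and real session vectors would be far higher-dimensional than this toy input.

```python
import math


def givens_qr(a):
    """Factor an m x n matrix (list of rows) as a = q * r using Givens rotations.

    Hypothetical helper for illustration. Returns (q, r), where q is an
    m x m orthogonal matrix and r is m x n and upper triangular.
    """
    m, n = len(a), len(a[0])
    r = [list(map(float, row)) for row in a]  # working copy that becomes R
    # Q starts as the identity and accumulates the transposed rotations.
    q = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]

    for j in range(n):
        # Zero the entries below the diagonal in column j, bottom to top.
        for i in range(m - 1, j, -1):
            x, y = r[i - 1][j], r[i][j]
            h = math.hypot(x, y)
            if h == 0.0:
                continue  # both entries are already zero
            c, s = x / h, y / h
            # Rotate rows i-1 and i of R so that r[i][j] becomes zero.
            for k in range(n):
                top, bottom = r[i - 1][k], r[i][k]
                r[i - 1][k] = c * top + s * bottom
                r[i][k] = -s * top + c * bottom
            # Apply the transposed rotation to columns i-1 and i of Q,
            # which keeps the product q * r equal to the input matrix.
            for k in range(m):
                left, right = q[k][i - 1], q[k][i]
                q[k][i - 1] = c * left + s * right
                q[k][i] = -s * left + c * right
    return q, r


if __name__ == "__main__":
    # Toy 3 x 2 matrix standing in for a tiny set of session vectors.
    q, r = givens_qr([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
    for row in r:
        print(row)
```

Each rotation touches only two rows at a time, so the factorization can proceed without forming a large dense transformation in one step. Whether and how the lecture exploits this property is not stated in the description above.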
