Speech Processing and Prosody
Course URL: http://videolectures.net/textSpeechDialogue_jouvet_speech_prosody...
Lecturer: Denis Jouvet
Institution: Lorraine Research Laboratory in Computer Science and its Applications (LORIA)
Date: 2019-10-08
Language: English
Course description: The prosody of the speech signal conveys information beyond the linguistic content of the message: prosody structures the utterance and also carries information about the speaker’s attitude and emotion. Sound duration, energy, and fundamental frequency are the prosodic features. However, computing and using them automatically is not straightforward. Sound duration features are usually extracted from speech recognition results or from a forced speech-text alignment. Although the resulting segmentation is usually acceptable on clean native speech data, performance degrades on noisy or non-native speech. Many algorithms have been developed for computing the fundamental frequency; they perform rather well on clean speech, but again, performance degrades in noisy conditions. In some applications, such as computer-assisted language learning, the relevance of the prosodic features is critical; indeed, the quality of the diagnosis of the learner’s pronunciation depends heavily on the precision and reliability of the estimated prosodic parameters.

The talk will consider the computation of prosodic features, show the limitations of automatic approaches, and discuss the problem of computing confidence measures on such features. It will then discuss the role of prosodic features and how they can be handled for automatic processing in tasks such as the detection of discourse particles, the characterization of emotions, and the classification of sentence modalities, as well as in computer-assisted language learning and expressive speech synthesis.
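
As a rough illustration of the prosodic features named in the description, the sketch below computes the energy of one frame and an autocorrelation-based estimate of the fundamental frequency. It also returns the normalized autocorrelation peak as a crude confidence score. This is not the method presented in the talk: the function names, the 75 to 400 Hz search range, and the confidence definition are all assumptions chosen for illustration.

```python
import math


def frame_energy(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(x * x for x in frame) / len(frame)


def estimate_f0(frame, sample_rate, fmin=75.0, fmax=400.0):
    """Return (f0_hz, confidence) for one frame.

    The lag with the highest normalized autocorrelation inside the
    [fmin, fmax] search range gives the period. The peak value itself,
    r[lag] / r[0], is used as an illustrative confidence score: values
    near 1 suggest a strongly periodic frame, values near 0 suggest
    noise or unvoiced speech.
    """
    n = len(frame)
    r0 = sum(x * x for x in frame)
    if r0 == 0.0:
        return 0.0, 0.0  # silent frame: no pitch, no confidence

    min_lag = max(1, int(sample_rate / fmax))
    max_lag = min(int(sample_rate / fmin), n - 1)

    best_lag, best_r = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Biased autocorrelation, normalized by the zero-lag energy.
        r = sum(frame[i] * frame[i + lag] for i in range(n - lag)) / r0
        if r > best_r:
            best_lag, best_r = lag, r

    if best_lag == 0:
        return 0.0, 0.0  # no positive correlation in the search range
    return sample_rate / best_lag, best_r


if __name__ == "__main__":
    sr = 16000
    # Synthetic 200 Hz tone, 40 ms long, standing in for a voiced frame.
    frame = [math.sin(2 * math.pi * 200 * i / sr) for i in range(640)]
    f0, conf = estimate_f0(frame, sr)
    print(f"energy={frame_energy(frame):.3f} f0={f0:.1f} Hz conf={conf:.2f}")
```

On real recordings, a caller would typically treat frames whose confidence falls below some threshold as unvoiced or unreliable. That threshold would need tuning for noisy or non-native speech, which is the difficulty the description points to.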
Keywords: speech signal; prosody; confidence measures
Source: VideoLectures.NET
Data collected: 2021-06-18:yumf
Last reviewed: 2021-06-18:yumf
Views: 69