Hierarchical spike coding of sound
Course URL: http://videolectures.net/machine_karklin_sound/
Lecturer: Yan Karklin
Institution: New York University
Date: 2013-06-14
Language: English
Course description: We develop a probabilistic generative model that represents acoustic event structure at multiple scales via a two-stage hierarchy. The first stage is a spiking representation that encodes a sound with a sparse set of kernels at different frequencies, positioned precisely in time. The coarse time and frequency statistical structure of the first-stage spikes is encoded by a second-stage spiking representation, while fine-scale statistical regularities are encoded by recurrent interactions within the first stage. When fitted to speech data, the model encodes acoustic features such as harmonic stacks, sweeps, and frequency modulations, which can be composed to represent complex acoustic events. The model can also synthesize sounds from the higher-level representation and provides a significant improvement over wavelet thresholding techniques on a denoising task.
Keywords: probabilistic generative model; acoustic events; sparse kernels
Source: VideoLectures.NET
Last reviewed: 2019-05-15 (cwx)
Views: 71
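
Illustrative sketch: the course description says the first stage encodes a sound with a sparse set of kernels at different frequencies, positioned precisely in time. One common way to compute a code of that general kind is greedy matching pursuit, shown below as a minimal sketch. This is not the lecture's probabilistic model: matching pursuit is a swapped-in technique, and the gammatone-shaped kernels, the function names, the sampling rate, and all parameter values are assumptions chosen for illustration.

```python
import numpy as np


def gammatone(freq_hz, sr, duration=0.02, order=4, bandwidth_hz=100.0):
    """Unit-norm gammatone-shaped kernel (an assumed kernel shape)."""
    t = np.arange(int(duration * sr)) / sr
    envelope = t ** (order - 1) * np.exp(-2 * np.pi * bandwidth_hz * t)
    g = envelope * np.cos(2 * np.pi * freq_hz * t)
    return g / np.linalg.norm(g)


def matching_pursuit(signal, kernels, n_spikes):
    """Greedily encode `signal` as spikes: (kernel index, sample offset, amplitude)."""
    residual = np.asarray(signal, dtype=float).copy()
    spikes = []
    for _ in range(n_spikes):
        best = None
        for k, kernel in enumerate(kernels):
            # Inner product of the residual with this kernel at every valid offset.
            corr = np.correlate(residual, kernel, mode="valid")
            t = int(np.argmax(np.abs(corr)))
            if best is None or abs(corr[t]) > abs(best[2]):
                best = (k, t, corr[t])
        k, t, amp = best
        # Kernels are unit-norm, so subtracting amp * kernel removes the projection.
        residual[t:t + len(kernels[k])] -= amp * kernels[k]
        spikes.append((k, t, float(amp)))
    return spikes, residual


def reconstruct(spikes, kernels, length):
    """Sum the scaled, time-shifted kernels back into a waveform."""
    out = np.zeros(length)
    for k, t, amp in spikes:
        out[t:t + len(kernels[k])] += amp * kernels[k]
    return out


# Toy usage: a signal built from two well-separated kernels (hypothetical values).
sr = 16000
kernels = [gammatone(500.0, sr), gammatone(2000.0, sr)]
signal = np.zeros(2000)
signal[100:100 + len(kernels[0])] += 1.0 * kernels[0]
signal[1000:1000 + len(kernels[1])] += 0.5 * kernels[1]

spikes, residual = matching_pursuit(signal, kernels, n_spikes=2)
approx = reconstruct(spikes, kernels, len(signal))
```

Each spike records which kernel fired, at which sample, and with what amplitude, so the list `spikes` plays the role of a sparse, precisely timed code. The second stage and the recurrent interactions mentioned in the description are not modeled in this sketch.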