
Multimodal Learning with Deep Boltzmann Machines
Course URL: http://videolectures.net/nips2012_salakhutdinov_multimodal_learni...
Instructor: Ruslan Salakhutdinov
Institution: Carnegie Mellon University
Date: 2013-01-16
Language: English
Course description: We propose a Deep Boltzmann Machine for learning a generative model of multimodal data. We show how to use the model to extract a meaningful representation of multimodal data. We find that the learned representation is useful for classification and information retrieval tasks, and hence conforms to some notion of semantic similarity. The model defines a probability density over the space of multimodal inputs. By sampling from the conditional distributions over each data modality, it is possible to create the representation even when some data modalities are missing. Our experimental results on bimodal data consisting of images and text show that the Multimodal DBM can learn a good generative model of the joint space of image and text inputs, one that is useful for information retrieval from both unimodal and multimodal queries. We further demonstrate that our model can significantly outperform SVMs and LDA on discriminative tasks. Finally, we compare our model to other deep learning methods, including autoencoders and deep belief networks, and show that it achieves significant gains.
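The idea of building a representation when one modality is missing can be illustrated with a small sketch. This is not the model from the lecture: it assumes a much simpler setup with a single hidden layer shared by binary image and text units (a bimodal RBM rather than a full DBM), and it uses random, untrained weights. The function name `fill_in_missing_text` and all parameter names are hypothetical. With the image units clamped, alternating Gibbs sampling draws the missing text units from their conditional distribution, and the final hidden activations serve as the joint representation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fill_in_missing_text(v_img, W_img, W_txt, b_txt, c_hid, n_steps=50, rng=None):
    """Gibbs sampling with the image clamped and the text modality missing.

    Assumed shapes (hypothetical toy model): W_img is (n_img, n_hid),
    W_txt is (n_txt, n_hid), b_txt is (n_txt,), c_hid is (n_hid,).
    """
    rng = np.random.default_rng(0) if rng is None else rng
    n_txt = W_txt.shape[0]
    # The text modality is unobserved, so start it from a random binary state.
    v_txt = (rng.random(n_txt) < 0.5).astype(float)
    p_txt = np.full(n_txt, 0.5)
    for _ in range(n_steps):
        # Sample the shared hidden units given both modalities.
        p_h = sigmoid(v_img @ W_img + v_txt @ W_txt + c_hid)
        h = (rng.random(p_h.shape) < p_h).astype(float)
        # Resample only the missing modality; the image stays clamped.
        p_txt = sigmoid(h @ W_txt.T + b_txt)
        v_txt = (rng.random(p_txt.shape) < p_txt).astype(float)
    # Hidden activation probabilities act as the joint representation.
    p_h = sigmoid(v_img @ W_img + v_txt @ W_txt + c_hid)
    return p_h, p_txt

# Toy usage with random (untrained) parameters, for shape checking only.
rng = np.random.default_rng(42)
n_img, n_txt, n_hid = 8, 5, 4
W_img = rng.normal(scale=0.1, size=(n_img, n_hid))
W_txt = rng.normal(scale=0.1, size=(n_txt, n_hid))
b_txt = np.zeros(n_txt)
c_hid = np.zeros(n_hid)
v_img = (rng.random(n_img) < 0.5).astype(float)

representation, p_txt = fill_in_missing_text(v_img, W_img, W_txt, b_txt, c_hid, rng=rng)
print(representation.shape, p_txt.shape)
```

In a trained model the sampled text units would reflect words likely to accompany the image; with random weights as here, the output only demonstrates the mechanics of clamping one modality and sampling the other.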
Keywords: generative model; multimodal data; data modality
Source: VideoLectures.NET
Last reviewed: 2019-09-07:lxf
Views: 62