Incorporating Structure in Deep Learning
Course URL: http://videolectures.net/iclr2016_urtasun_incoporating_structure/
Lecturer: Raquel Urtasun
Institution: University of Toronto
Date: 2016-05-27
Language: English
Course description: Deep learning algorithms attempt to model high-level abstractions of the data using architectures composed of multiple non-linear transformations. Many variants have been proposed and shown to be extremely successful in a wide variety of applications, including computer vision, speech recognition, and natural language processing. In this talk I'll show how to make these representations more powerful by exploiting structure in the outputs, in the loss function, and in the learned embeddings.

Many problems in real-world applications involve predicting several random variables that are statistically related. Graphical models have typically been employed to represent and exploit the output dependencies. However, most current learning algorithms assume that the models are log-linear in the parameters. In the first part of the talk I'll show a variety of algorithms that can learn arbitrary functions while exploiting the output dependencies, unifying deep learning and graphical models.

Supervised training of deep neural nets typically relies on minimizing cross-entropy. However, in many domains we are interested in performing well on metrics specific to the application domain. In the second part of the talk I'll show a direct loss minimization approach to train deep neural networks, which provably minimizes the task loss. This is often non-trivial, since these loss functions are neither smooth nor decomposable and thus are not amenable to optimization with standard gradient-based methods. I'll demonstrate the applicability of this general framework in the context of maximizing average precision, a structured loss commonly used to evaluate ranking problems.

Deep learning has become a very popular approach to learning word, sentence, and/or image embeddings. Neural embeddings have shown great performance in tasks such as image captioning, machine translation, and paraphrasing. In the last part of my talk I'll show how to exploit the partial-order structure of the visual-semantic hierarchy over words, sentences, and images to learn order embeddings. I'll demonstrate the utility of these new representations for hypernym prediction and image-caption retrieval.
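Two parts of the talk have concrete formulations in the literature that can be sketched briefly. For direct loss minimization, the task-loss gradient is estimated by comparing standard inference against a loss-perturbed inference step. The sketch below is a minimal NumPy rendering of that finite-difference estimate for a linear scoring function w·phi(y) over an explicit candidate set; phi, candidates, and task_loss are illustrative placeholders, not objects from the lecture.

    import numpy as np

    def direct_loss_gradient(w, phi, candidates, y_true, task_loss, eps=0.1):
        # Standard inference: the highest-scoring candidate under w.
        y_hat = max(candidates, key=lambda y: w @ phi(y))
        # Loss-perturbed inference: the score is nudged by eps times the
        # task loss, steering the argmax toward high-loss candidates.
        y_dir = max(candidates, key=lambda y: w @ phi(y) + eps * task_loss(y, y_true))
        # Finite-difference estimate of the task-loss gradient
        # (positive-update variant): (phi(y_dir) - phi(y_hat)) / eps.
        return (phi(y_dir) - phi(y_hat)) / eps

For the order embeddings of the last part, which appears to draw on the ICLR 2016 order-embeddings paper (Vendrov et al.), words, sentences, and images are embedded in the non-negative orthant and violations of a reversed product order are penalized. A minimal sketch of that penalty and a margin-based training loss over it, again assuming NumPy vectors:

    def order_violation(x, y):
        # E(x, y) = ||max(0, y - x)||^2: zero exactly when x dominates y
        # coordinate-wise, i.e. when x precedes y in the reversed product order.
        return np.sum(np.maximum(0.0, y - x) ** 2)

    def order_embedding_loss(positive_pairs, negative_pairs, margin=1.0):
        # Ordered pairs are pulled toward zero violation; unordered
        # (negative) pairs are pushed to at least `margin` violation.
        pos = sum(order_violation(x, y) for x, y in positive_pairs)
        neg = sum(max(0.0, margin - order_violation(x, y)) for x, y in negative_pairs)
        return pos + neg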
Keywords: deep learning; algorithms; graphical models
Source: VideoLectures.NET
Data collected: 2020-11-27 (yxd)
Last reviewed: 2020-11-27 (yxd)
Views: 69