Recurrent Neural Networks |
|
Course URL: | http://videolectures.net/deeplearning2016_bengio_neural_networks/ |
Lecturer: | Yoshua Bengio |
Institution: | Université de Montréal |
Date: | 2016-08-23 |
Language: | English |
Summary: | This lecture will cover recurrent neural networks, the key ingredient in the deep learning toolbox for handling sequential computation and modelling sequences. It will start by explaining how gradients can be computed (by considering the time-unfolded graph) and how different architectures can be designed to summarize a sequence, generate a sequence by ancestral sampling in a fully-observed directed model, or learn to map a vector to a sequence, a sequence to a sequence (of the same or different length), or a sequence to a vector. The issue of long-term dependencies, why it arises, and what has been proposed to alleviate it will be a core subject of discussion in this lecture. This includes changes in the architecture and initialization, as well as how to properly characterize the architecture in terms of recurrent or feedforward depth and its ability to create shortcuts or fast propagation of gradients in the unfolded graph. Open questions regarding the limitations of training by maximum likelihood (teacher forcing) and ideas towards making learning online (not requiring backprop through time) will also be discussed. (An illustrative code sketch of the time-unfolded gradient computation and teacher forcing follows this entry.) |
Keywords: | recurrent neural networks; architecture; sequences |
Source: | VideoLectures.NET (视频讲座网) |
Data collected: | 2020-11-22: yxd |
Last reviewed: | 2020-11-22: yxd |
Views: | 43 |
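
The sketch below is a minimal, self-contained illustration (not taken from the lecture) of two ideas the summary names: gradients computed over the time-unfolded graph (backprop through time), and training with teacher forcing, where the ground-truth token at each step is fed as the next input. The toy dimensions, weight names (Wxh, Whh, Why), and the clipped SGD step are illustrative assumptions, not the lecture's own code.

```python
import numpy as np

rng = np.random.default_rng(0)
V, H = 5, 8                       # toy vocabulary size and hidden size (assumed)
Wxh = rng.normal(0, 0.1, (H, V))  # input  -> hidden
Whh = rng.normal(0, 0.1, (H, H))  # hidden -> hidden (the recurrent weights)
Why = rng.normal(0, 0.1, (V, H))  # hidden -> output
bh, by = np.zeros(H), np.zeros(V)

def one_hot(i):
    v = np.zeros(V); v[i] = 1.0
    return v

def bptt(inputs, targets, h0):
    """Forward pass over the unfolded graph, then backprop through time.
    With teacher forcing, the ground-truth token at step t is the input at step t."""
    xs, hs, ps = {}, {-1: h0}, {}
    loss = 0.0
    for t, (i, j) in enumerate(zip(inputs, targets)):        # unfold the recurrence in time
        xs[t] = one_hot(i)
        hs[t] = np.tanh(Wxh @ xs[t] + Whh @ hs[t - 1] + bh)
        logits = Why @ hs[t] + by
        ps[t] = np.exp(logits - logits.max()); ps[t] /= ps[t].sum()
        loss -= np.log(ps[t][j])                              # cross-entropy at each step
    dWxh, dWhh, dWhy = np.zeros_like(Wxh), np.zeros_like(Whh), np.zeros_like(Why)
    dbh, dby = np.zeros_like(bh), np.zeros_like(by)
    dh_next = np.zeros(H)
    for t in reversed(range(len(inputs))):                    # walk the unfolded graph backwards
        dy = ps[t].copy(); dy[targets[t]] -= 1.0              # d loss / d logits
        dWhy += np.outer(dy, hs[t]); dby += dy
        dh = Why.T @ dy + dh_next                             # gradient from the output and from step t+1
        dh_raw = (1.0 - hs[t] ** 2) * dh                      # back through tanh
        dWxh += np.outer(dh_raw, xs[t]); dWhh += np.outer(dh_raw, hs[t - 1]); dbh += dh_raw
        dh_next = Whh.T @ dh_raw                              # repeated products with Whh: the source of
                                                              # vanishing/exploding gradients over long spans
    return loss, (dWxh, dWhh, dWhy, dbh, dby), hs[len(inputs) - 1]

# Teacher forcing on a toy sequence 0 1 2 3 4: predict the next token at each step.
seq = [0, 1, 2, 3, 4]
loss, grads, _ = bptt(seq[:-1], seq[1:], np.zeros(H))
for W, dW in zip((Wxh, Whh, Why, bh, by), grads):
    W -= 0.1 * np.clip(dW, -5, 5)                             # clipped SGD step (a common mitigation)
print(f"one-step loss: {loss:.3f}")
```

The repeated multiplication by Whh in the backward loop is what makes gradients shrink or blow up over many time steps, which is why the lecture's discussion of long-term dependencies focuses on architectural and initialization changes that shorten or bypass these paths in the unfolded graph.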