The landscape of the loss surfaces of multilayer networks
Course URL: http://videolectures.net/colt2015_choromanska_multilayer_networks...
Lecturer: Anna Choromanska
Provider: VideoLectures.NET
Lecture date: 2015-08-20
Language: English
Description: Deep learning has enjoyed a resurgence of interest in the last few years for applications such as image and speech recognition or natural language processing. The vast majority of practical applications of deep learning focus on supervised learning, where the supervised loss function is minimized using stochastic gradient descent. However, the properties of this highly non-convex loss function, such as its landscape and the behavior of its critical points (maxima, minima, and saddle points), as well as the reason why large and small networks achieve radically different practical performance, remain poorly understood. It was only recently shown that new results in spin-glass theory may provide an explanation for these questions, by establishing a connection between the loss function of a neural network and the Hamiltonian of the spherical spin-glass model. The connection between the two models relies on a number of possibly unrealistic assumptions, yet empirical evidence suggests that the connection may exist in practice. The question we pose is whether it is possible to drop some of these assumptions and thereby establish a stronger connection between the two models.
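For reference, here is a minimal sketch of the two objects the description connects, written in generic notation that is not taken from the lecture itself: the empirical supervised loss minimized by stochastic gradient descent, and the Hamiltonian of the spherical p-spin spin-glass model (in the analogy studied in this line of work, p corresponds roughly to the network depth).

    % Empirical supervised loss over n training pairs (x_i, y_i), minimized
    % by stochastic gradient descent on the network weights \theta:
    L(\theta) = \frac{1}{n} \sum_{i=1}^{n} \ell\bigl(f(x_i; \theta), y_i\bigr),
    \qquad
    \theta \leftarrow \theta - \eta \, \nabla_\theta \ell\bigl(f(x_i; \theta), y_i\bigr)

    % Hamiltonian of the spherical p-spin spin-glass model on N spins \sigma,
    % with i.i.d. Gaussian couplings J and the spherical constraint:
    H_{N,p}(\sigma) = \frac{1}{N^{(p-1)/2}}
    \sum_{i_1, \dots, i_p = 1}^{N} J_{i_1 \dots i_p} \, \sigma_{i_1} \cdots \sigma_{i_p},
    \qquad
    \sum_{i=1}^{N} \sigma_i^2 = N

The "possibly unrealistic assumptions" mentioned above are, roughly speaking, what allow the loss L(\theta) to be rewritten in the form of H_{N,p}(\sigma).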
Keywords: natural language; deep learning; supervised learning; stochastic gradient descent
Source: VideoLectures.NET
Data collected: 2022-12-02:chenxin01
Last reviewed: 2022-12-02:chenxin01
Views: 29