


Convergent Learning: Do different neural networks learn the same representations?
Course URL: http://videolectures.net/iclr2016_yosinski_convergent_learning/
Lecturer: Jason Yosinski
Institution: Cornell University
Date: 2016-05-27
Language: English
Course description: Recent successes in training deep neural networks have prompted active investigation into the features learned on their intermediate layers. Such research is difficult because it requires making sense of non-linear computations performed by millions of parameters, but it is valuable because it increases our ability to understand current models and to create improved versions of them. In this paper we investigate the extent to which neural networks exhibit what we call convergent learning, which is when the representations learned by multiple networks converge to a set of features that are either individually similar between networks or whose subsets span similar low-dimensional spaces. We propose a specific method of probing representations: training multiple networks and then comparing and contrasting their individual, learned representations at the level of neurons or groups of neurons. We begin research into this question using three techniques to approximately align different neural networks at the feature level: a bipartite matching approach that makes one-to-one assignments between neurons, a sparse prediction approach that finds one-to-many mappings, and a spectral clustering approach that finds many-to-many mappings. This initial investigation reveals a few previously unknown properties of neural networks, and we argue that future research into the question of convergent learning will yield many more. The insights described here include (1) that some features are learned reliably in multiple networks, yet other features are not consistently learned; (2) that units learn to span low-dimensional subspaces and, while these subspaces are common to multiple networks, the specific basis vectors learned are not; and (3) that the representation codes show evidence of being a mix between a local code and slightly, but not fully, distributed codes across multiple units.
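The description lists three alignment techniques. As a concrete illustration of the first (one-to-one bipartite matching between neurons), the following is a minimal sketch, not the authors' implementation: it assumes activation matrices acts_a and acts_b (examples x units) have already been collected from the same layer of two independently trained networks on the same inputs, and it uses SciPy's Hungarian-algorithm solver to pair units by activation correlation. The function name match_units and all other details are illustrative assumptions.

import numpy as np
from scipy.optimize import linear_sum_assignment

def match_units(acts_a: np.ndarray, acts_b: np.ndarray):
    """Pair each unit of network A with one unit of network B by maximizing correlation."""
    # Standardize each unit's activations so that dot products become correlations.
    za = (acts_a - acts_a.mean(axis=0)) / (acts_a.std(axis=0) + 1e-8)
    zb = (acts_b - acts_b.mean(axis=0)) / (acts_b.std(axis=0) + 1e-8)
    corr = za.T @ zb / acts_a.shape[0]   # [units_a, units_b] correlation matrix
    # linear_sum_assignment minimizes total cost, so negate to maximize total correlation.
    rows, cols = linear_sum_assignment(-corr)
    return rows, cols, corr[rows, cols]  # matched unit indices and their correlations

# Usage with random activations standing in for two trained networks:
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.standard_normal((1000, 64))   # hypothetical activations from network A
    b = rng.standard_normal((1000, 64))   # hypothetical activations from network B
    idx_a, idx_b, matched_corr = match_units(a, b)
    print("mean matched correlation:", matched_corr.mean())

A high mean matched correlation between two networks would indicate that individual features are learned reliably across runs; the one-to-many and many-to-many techniques mentioned above relax the one-to-one constraint.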
Keywords: parameter execution; non-linear computation; neural networks
Source: 视频讲座网
Data collected: 2022-11-11:chenjy
Last reviewed: 2022-11-11:chenjy
Views: 37