Short-term memory traces in neural networks
Course URL: http://videolectures.net/eccs08_gangulli_stmtinn/
Lecturer: Surya Ganguli
Institution: Stanford University
Date: 2008-10-17
Language: English
Course description: Critical cognitive phenomena such as planning and decision making rely on the ability of the brain to hold information in working memory. Many proposals exist for maintaining such memories in persistent activity arising from stable fixed-point attractors in the dynamics of recurrent neural networks. However, such fixed points are incapable of storing temporal sequences of recent events. An alternative, relatively less explored paradigm is the storage of arbitrary temporal input sequences in the transient responses of a recurrent neural network. Such a paradigm raises a host of important questions. Are there any fundamental limits on the duration of such transient memory traces? How do these limits depend on the size of the network? What patterns of synaptic connections yield good performance on generic working memory tasks? To what extent do these traces degrade in the presence of noise? We use the theory of Fisher information to construct a novel measure of memory traces in neural networks. By combining Fisher information with dynamical systems theory, we find precise answers to the above questions for general linear neural networks. We prove that the temporal duration of a memory trace in any network is at most proportional to the number of neurons in the network. However, memory traces in generic recurrent networks have a short duration even when the number of neurons in the network is large. Networks that exhibit good working memory performance must have a (possibly hidden) feedforward architecture, such that the signal entering at the first layer is amplified as it propagates from one layer to the next. We prove that networks subject to a saturating nonlinearity can achieve memory traces whose duration is proportional to the square root of the number of neurons. These networks have a feedforward architecture with divergent connectivity. By spreading excitation across many neurons in each layer, such networks achieve signal amplification without saturating single neurons.
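The Fisher-information measure mentioned in the description can be made concrete. For a linear network x(t) = W x(t-1) + v s(t) + z(t) driven by a scalar signal s and unit-variance Gaussian noise z, one standard formalization (consistent with the speaker's related work on memory traces in linear dynamical systems) defines a Fisher memory curve J(k) = (W^k v)^T C^{-1} (W^k v), where C = Σ_{m≥0} W^m (W^m)^T is the stationary noise covariance; the result quoted above says the total memory Σ_k J(k) grows at most linearly in the number of neurons. The Python sketch below was written for this page and is not taken from the lecture: it computes J(k) for a generic random recurrent network and for a hidden feedforward chain whose gain g > 1 amplifies the signal from layer to layer. All parameter choices (n = 100, gain 1.2, input direction v) are illustrative assumptions.

import numpy as np

def fisher_memory_curve(W, v, kmax=120, terms=500):
    """Fisher memory curve J(k) for x(t) = W x(t-1) + v s(t) + z(t), z ~ N(0, I)."""
    n = W.shape[0]
    # Stationary noise covariance C = sum_{m>=0} W^m (W^m)^T, truncated series;
    # assumes the spectral radius of W is below 1 so the sum converges.
    C = np.zeros((n, n))
    M = np.eye(n)
    for _ in range(terms):
        C += M @ M.T
        M = W @ M
    Cinv = np.linalg.inv(C)
    J = []
    u = v.copy()
    for _ in range(kmax):
        J.append(float(u @ Cinv @ u))  # J(k) = (W^k v)^T C^{-1} (W^k v)
        u = W @ u                      # push the signal one step deeper in time
    return np.array(J)

rng = np.random.default_rng(0)
n = 100
v = np.zeros(n)
v[0] = 1.0  # signal enters at the first neuron

# Generic recurrent network: dense random weights, spectral radius ~0.9.
W_rand = 0.9 * rng.normal(size=(n, n)) / np.sqrt(n)

# Hidden feedforward architecture: a delay line whose gain g = 1.2 > 1
# amplifies the signal as it passes from each "layer" (neuron) to the next.
W_chain = np.diag(np.full(n - 1, 1.2), k=-1)

J_rand = fisher_memory_curve(W_rand, v)
J_chain = fisher_memory_curve(W_chain, v)
print("total Fisher memory, generic recurrent :", round(J_rand.sum(), 2))
print("total Fisher memory, feedforward chain :", round(J_chain.sum(), 2))

Running this, the random network's curve decays within a few time steps, while the amplifying chain holds J(k) roughly constant across all n of its stages before dropping to zero, illustrating the description's claim that generic recurrent networks forget quickly even when large, whereas a hidden feedforward architecture with layer-to-layer amplification sustains a memory trace whose duration scales with the number of neurons.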
Keywords: recurrent neural networks; transient memory; memory traces
Source: VideoLectures.NET
Last reviewed: 2019-03-20 (lxf)
Views: 48