Using Fast Weights to Improve Persistent Contrastive Divergence |
|
Course URL: | https://videolectures.net/videos/icml09_tieleman_ufw
Lecturer: | Tijmen Tieleman
Provider: | Conference (ICML 2009)
Date: | 2009-08-26
Language: | English
Course description: | The most commonly used learning algorithm for restricted Boltzmann machines is contrastive divergence, which starts a Markov chain at a data point and runs the chain for only a few iterations to get a cheap, low-variance estimate of the sufficient statistics under the model. Tieleman (2008) showed that better learning can be achieved by estimating the model's statistics using a small set of persistent "fantasy particles" that are not reinitialized to data points after each weight update. With sufficiently small weight updates, the fantasy particles represent the equilibrium distribution accurately, but to explain why the method works with much larger weight updates it is necessary to consider the interaction between the weight updates and the Markov chain. We show that the weight updates force the Markov chain to mix fast, and using this insight we develop an even faster-mixing chain that uses an auxiliary set of "fast weights" to implement a temporary overlay on the energy landscape. The fast weights learn rapidly but also decay rapidly and do not contribute to the normal energy landscape that defines the model.
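The description above explains the fast-weight variant of persistent contrastive divergence (PCD) only in words; the sketch below shows one way such an update could look for a binary RBM. It is a minimal illustration under stated assumptions, not the lecture's implementation: NumPy is assumed, the layer sizes, learning rates, and the 0.95 fast-weight decay are placeholder values, and bias updates are omitted for brevity.

```python
# Minimal sketch of PCD with a fast-weight overlay for a binary RBM (NumPy).
# All hyperparameters below are illustrative assumptions, not values from the talk.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample(p):
    # Bernoulli sample with success probability p.
    return (rng.random(p.shape) < p).astype(float)

n_vis, n_hid, n_chains = 784, 500, 100
W      = 0.01 * rng.standard_normal((n_vis, n_hid))  # slow weights: they define the model
W_fast = np.zeros((n_vis, n_hid))                    # temporary overlay: learns and decays fast
b_v, b_h = np.zeros(n_vis), np.zeros(n_hid)

# Persistent "fantasy particles": never reinitialized to data points.
v_fantasy = sample(np.full((n_chains, n_vis), 0.5))

def fpcd_update(v_data, lr=1e-3, lr_fast=1e-2, fast_decay=0.95):
    global W, W_fast, v_fantasy
    # Positive phase: sufficient statistics from the data, under the slow weights.
    h_data = sigmoid(v_data @ W + b_h)
    pos = v_data.T @ h_data / len(v_data)

    # Negative phase: one Gibbs step on the fantasy particles, driven by the
    # overlaid energy landscape W + W_fast, which mixes faster than W alone.
    h_f = sample(sigmoid(v_fantasy @ (W + W_fast) + b_h))
    v_fantasy = sample(sigmoid(h_f @ (W + W_fast).T + b_v))
    h_f_prob = sigmoid(v_fantasy @ (W + W_fast) + b_h)
    neg = v_fantasy.T @ h_f_prob / n_chains

    grad = pos - neg
    W += lr * grad                                   # slow update defines the model
    W_fast = fast_decay * W_fast + lr_fast * grad    # fast weights learn fast, decay fast
```

Consistent with the description, the shared gradient temporarily raises the energy (via W_fast) around states the fantasy particles currently occupy, pushing them to keep moving, while the model itself remains defined by the slow weights W alone.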
Keywords: | fast weights; contrastive divergence; learning algorithm
Source: | VideoLectures.NET (视频讲座网)
Data collected: | 2025-04-07:liyq
Last reviewed: | 2025-04-07:liyq
Views: | 4