

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Course URL: http://videolectures.net/icml2015_ioffe_batch_normalization/
Lecturer: Sergey Ioffe
Institution: Google
Date: 2015-12-05
Language: English
Abstract: Training Deep Neural Networks is complicated by the fact that the distribution of each layer's inputs changes during training, as the parameters of the previous layers change. This slows down the training by requiring lower learning rates and careful parameter initialization, and makes it notoriously hard to train models with saturating nonlinearities. We refer to this phenomenon as internal covariate shift, and address the problem by normalizing layer inputs. Our method draws its strength from making normalization a part of the model architecture and performing the normalization for each training mini-batch. Batch Normalization allows us to use much higher learning rates and be less careful about initialization, and in some cases eliminates the need for Dropout. Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin. Using an ensemble of batch-normalized networks, we improve upon the best published result on ImageNet classification: reaching 4.82% top-5 test error, exceeding the accuracy of human raters.
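The per-mini-batch transform summarized in the abstract normalizes each activation to zero mean and unit variance over the mini-batch, then applies a learned scale and shift. Below is a minimal NumPy sketch of that forward pass under simplifying assumptions (training mode only; the function name and toy shapes are illustrative, and a full implementation would also track running statistics for use at inference time).

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """Normalize a mini-batch over the batch dimension, then scale and shift.

    x:     (N, D) mini-batch of layer inputs
    gamma: (D,) learnable scale
    beta:  (D,) learnable shift
    """
    mu = x.mean(axis=0)                    # per-feature mini-batch mean
    var = x.var(axis=0)                    # per-feature mini-batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)  # normalize to zero mean, unit variance
    return gamma * x_hat + beta            # learned scale/shift restores capacity

# Toy usage: a mini-batch of 4 examples with 3 features.
x = np.random.randn(4, 3) * 5.0 + 2.0
y = batch_norm_forward(x, gamma=np.ones(3), beta=np.zeros(3))
print(y.mean(axis=0), y.var(axis=0))  # approximately 0 and 1 per feature
```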
Keywords: deep neural networks; parameter initialization; normalized networks
Source: VideoLectures.NET
Data collected: 2022-11-16: chenjy
Last reviewed: 2022-11-16: chenjy
Views: 51