

Beyond Stochastic Gradient Descent
Course URL: http://videolectures.net/roks2013_bach_optimization/
Lecturer: Francis R. Bach
Institution: INRIA - SIERRA project team
Date: 2013-08-26
Language: English
Abstract: Many machine learning and signal processing problems are traditionally cast as convex optimization problems. A common difficulty in solving these problems is the size of the data, where there are many observations ("large n") and each of these is large ("large p"). In this setting, online algorithms, which pass over the data only once, are usually preferred over batch algorithms, which require multiple passes over the data. In this talk, I will present several recent results, showing that in the ideal infinite-data setting, online learning algorithms based on stochastic approximation should be preferred, but that in the practical finite-data setting, an appropriate combination of batch and online algorithms leads to unexpected behaviors, such as a linear convergence rate with an iteration cost similar to stochastic gradient descent. (Joint work with Nicolas Le Roux, Eric Moulines and Mark Schmidt.)
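
The "linear convergence rate with an iteration cost similar to stochastic gradient descent" mentioned above concerns finite-data methods that reuse stored per-example gradients. The Python sketch below is purely illustrative and is not the algorithm presented in the talk: it assumes a synthetic least-squares problem, hand-picked step sizes, and hypothetical helper names (sgd, sag_style), and only contrasts plain SGD with an averaged-gradient update whose per-iteration cost stays comparable to SGD's.

# Minimal, illustrative sketch (assumptions noted above), not the speaker's exact method.
import numpy as np

rng = np.random.default_rng(0)
n, p = 1000, 20                                    # "large n" observations, dimension p
X = rng.standard_normal((n, p))
w_true = rng.standard_normal(p)
y = X @ w_true + 0.1 * rng.standard_normal(n)

def sgd(X, y, steps=5000, lr=0.01):
    """Plain stochastic gradient descent: one observation per update."""
    n, p = X.shape
    w = np.zeros(p)
    for t in range(steps):
        i = rng.integers(n)
        grad_i = (X[i] @ w - y[i]) * X[i]          # gradient of 0.5*(x_i^T w - y_i)^2
        w -= lr / np.sqrt(t + 1) * grad_i          # decaying step size
    return w

def sag_style(X, y, steps=5000, lr=0.01):
    """Averaged-gradient update: keep the most recent per-example gradients and
    step along their running average; per-iteration cost stays O(p), like SGD."""
    n, p = X.shape
    w = np.zeros(p)
    grad_table = np.zeros((n, p))                  # last seen gradient of each example
    grad_sum = np.zeros(p)
    for t in range(steps):
        i = rng.integers(n)
        grad_i = (X[i] @ w - y[i]) * X[i]
        grad_sum += grad_i - grad_table[i]         # refresh example i in the average
        grad_table[i] = grad_i
        w -= lr * grad_sum / n                     # constant step along the averaged gradient
    return w

for name, w_hat in [("SGD", sgd(X, y)), ("SAG-style", sag_style(X, y))]:
    print(name, "error:", np.linalg.norm(w_hat - w_true))

Both updates touch a single observation per iteration; the averaged variant pays extra memory (one stored gradient per example) in exchange for faster convergence on a fixed, finite dataset.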
Keywords: stochastic gradient; data algorithms; iteration cost
Source: VideoLectures.NET
Data collected: 2023-04-22: chenxin01
Last reviewed: 2023-05-18: chenxin01
Views: 11