Frequency-aware Truncated Methods for Sparse Online Learning
Course URL: http://videolectures.net/ecmlpkdd2011_oiwa_learning/
Lecturer: Hidekazu Oiwa
Institution: University of Tokyo
Date: 2011-11-29
Language: Japanese
Description: Online supervised learning with L1 regularization has recently gained attention because it generally requires less computation time and less memory than batch learning methods. However, a simple L1-regularization method used in an online setting has a side effect: rare features tend to be truncated more than necessary. This matters in practice because feature frequency is highly skewed in many applications. We developed a new family of L1-regularization methods based on previous updates for loss minimization in linear online learning settings. Our methods can identify and retain infrequent but informative features at the same computational cost and convergence rate as previous work. Moreover, we combined our methods with a cumulative penalty model to derive models that are more robust to noisy data. We applied our methods to several datasets and empirically evaluated their performance. The experimental results showed that our frequency-aware truncated models improved prediction accuracy.
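The side effect described above can be reproduced on a toy stream. The following is a minimal sketch, not the method presented in the lecture: it assumes a linear model with squared loss, a soft-thresholding step applied to every weight after each example, and, for the frequency-aware mode, a hypothetical heuristic that scales each feature's threshold by its empirical occurrence rate. The names `soft_threshold`, `train`, `make_stream`, and the parameters `eta`, `lam`, and `period` are illustrative assumptions.

```python
def soft_threshold(w, tau):
    """Shrink w toward zero by tau; values within [-tau, tau] become exactly 0."""
    if w > tau:
        return w - tau
    if w < -tau:
        return w + tau
    return 0.0


def train(stream, eta=0.1, lam=0.2, frequency_aware=False):
    """Online gradient step on squared loss, then L1 truncation of all weights.

    frequency_aware=True scales each threshold by counts[j] / t. This scaling
    is an assumption made for illustration, not the lecture's update rule.
    """
    w, counts = {}, {}
    for t, (x, y) in enumerate(stream, start=1):
        # Prediction error of the current linear model on a sparse example.
        err = sum(w.get(j, 0.0) * v for j, v in x.items()) - y
        # Gradient step touches only the features present in this example.
        for j, v in x.items():
            w[j] = w.get(j, 0.0) - eta * err * v
            counts[j] = counts.get(j, 0) + 1
        # Truncation touches every weight, including absent features.
        for j in w:
            rate = counts[j] / t if frequency_aware else 1.0
            w[j] = soft_threshold(w[j], eta * lam * rate)
    return w


def make_stream(n=100, period=20):
    """Toy data: 'common' fires every round, 'rare' once per period."""
    stream = []
    for t in range(1, n + 1):
        x, y = {"common": 1.0}, 1.0
        if t % period == 1:
            x["rare"] = 1.0
            y += 2.0  # the rare feature is informative
        stream.append((x, y))
    return stream


if __name__ == "__main__":
    data = make_stream()
    print("uniform penalty:", train(data, frequency_aware=False))
    print("scaled penalty: ", train(data, frequency_aware=True))
```

On this toy stream the uniform penalty shrinks the `rare` weight back to zero between occurrences, because the weight is penalized in every round but updated only once per period. The scaled penalty leaves it nonzero at the end of the stream.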
Keywords: regularization; online supervised learning; rare features
Source: VideoLectures.NET
Last edited: 2019-04-03:lxf