

Online Multiple Kernel Learning for Structured Prediction
Course URL: http://videolectures.net/nipsworkshops2010_martins_oms/
Lecturer: André F. T. Martins
Institution: Carnegie Mellon University
Date: 2011-01-12
Language: English
Course description: Structured prediction (SP) problems are characterized by strong interdependence among the output variables, usually with sequential, graphical, or combinatorial structure [17, 7]. Obtaining good predictors often requires a large effort in feature/kernel design and tuning (usually done via cross-validation). Because discriminative training of structured predictors can be quite slow, especially in large-scale settings, it is appealing to learn the kernel function simultaneously. In multiple kernel learning (MKL, [8]), the kernel is learned as a linear combination of prespecified base kernels. This framework has been made scalable with the advent of wrapper-based methods, in which a standard learning problem (e.g., an SVM) is repeatedly solved in an inner loop up to a prescribed accuracy [14, 11, 6]. Unfortunately, extending such methods to large-scale SP raises practical hurdles: since the output space is large, so are the kernel matrices and the number of support vectors; moreover, it becomes prohibitive to tackle the inner problem in its batch form, and one usually resorts to online algorithms [12, 3, 13]. The latter are fast learners but slow optimizers [2]; using them in the inner loop with early stopping may misguide the overall MKL optimization. We propose instead a stand-alone online MKL algorithm which exploits the large-scale tradeoffs directly. The algorithm iterates between subgradient and proximal steps, and has important advantages: (i) it is simple, flexible, and compatible with sparse and non-sparse variants of MKL; (ii) it is adequate for SP; (iii) it offers regret, convergence, and generalization guarantees. Experimentally, the proposed algorithm yields near-optimal solutions faster than wrapper-based approaches (adapted to SP). We use the following two SP experimental testbeds: sequence labeling for handwritten text recognition, and natural language dependency parsing. The potential of our approach is shown in learning combinations of kernels from tens of thousands of training instances, with encouraging results in terms of speed, accuracy, and interpretability.
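To illustrate the "subgradient step, then proximal step" pattern the abstract describes, here is a minimal sketch of online MKL in a deliberately simplified setting: binary classification with explicit feature maps (one feature group per base kernel) and a hinge loss, rather than the structured outputs of the lecture. The function names, step size, and regularizer are my own choices; the proximal step is block (group) soft-thresholding, which is what induces sparsity over the kernel weights in sparse MKL variants.

```python
import numpy as np

def group_soft_threshold(w, tau):
    """Proximal operator of tau * ||w||_2 (block soft-thresholding).
    Shrinks the whole group toward zero; zeroes it out entirely if its
    norm is below tau, which is how kernels get discarded."""
    norm = np.linalg.norm(w)
    if norm <= tau:
        return np.zeros_like(w)
    return (1.0 - tau / norm) * w

def online_mkl_sketch(examples, labels, lam=0.1, eta=0.5, epochs=5):
    """examples: list of training points; each point is a list of M feature
    vectors, one per base kernel (explicit feature maps).
    labels: binary labels in {-1, +1}.
    Iterates between a hinge-loss subgradient step and a proximal step."""
    M = len(examples[0])  # number of base kernels / feature groups
    w = [np.zeros_like(examples[0][m]) for m in range(M)]
    for _ in range(epochs):
        for feats, y in zip(examples, labels):
            score = sum(float(np.dot(w[m], feats[m])) for m in range(M))
            if y * score < 1.0:  # margin violated: subgradient step
                for m in range(M):
                    w[m] = w[m] + eta * y * feats[m]
            # proximal step on each group, encouraging kernel-level sparsity
            w = [group_soft_threshold(w[m], eta * lam) for m in range(M)]
    return w
```

On a toy problem where only the first group carries signal, the sketch learns a positive weight on that group and zeroes out the uninformative one, mirroring how sparse MKL selects among base kernels.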
Keywords: structured output; computer science; machine learning; online learning
Source: VideoLectures.NET
Last reviewed: 2020-05-30: 张荧 (volunteer course editor)
Views: 44