

Communication-Efficient Distributed Block Minimization for Nonlinear Kernel Machines
Course URL: http://videolectures.net/kdd2017_hsieh_kernel_machines/
Lecturer: Cho-Jui Hsieh
Offered by: VideoLectures.NET
Date: 2017-10-09
Language: English
Course description: Kernel machines often yield superior predictive performance on various tasks; however, they suffer from severe computational challenges. In this paper, we show how to overcome the important challenge of speeding up kernel machines. In particular, we develop a parallel block minimization framework for solving kernel machines, including kernel SVM and kernel logistic regression. Our framework proceeds by dividing the problem into smaller subproblems through a block-diagonal approximation of the Hessian matrix. The subproblems are then solved approximately in parallel. After that, a communication-efficient line search procedure ensures sufficient reduction of the objective function value at each iteration. We prove a global linear convergence rate for the proposed method with a wide class of subproblem solvers, and our analysis covers strongly convex and some non-strongly convex functions. We apply our algorithm to solve large-scale kernel SVM problems on distributed systems and show a significant improvement over existing parallel solvers. As an example, on the covtype dataset with half a million samples, our algorithm obtains an approximate solution with 96% accuracy in 20 seconds using 32 machines, while all other parallel kernel SVM solvers require more than 2000 seconds to reach a solution with 95% accuracy. Moreover, our algorithm scales to very large datasets, such as the kdd algebra dataset with 8 million samples and 20 million features.
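The framework described above can be illustrated in miniature. The sketch below is not the authors' implementation: it solves the kernel-SVM dual with block updates driven by the block-diagonal part of the Hessian, followed by a backtracking line search on the exact objective (the step that, in the distributed setting, requires only cheap communication). Blocks are solved sequentially here rather than on separate machines, the subproblem "solver" is a single preconditioned projected step, and the names `rbf_kernel` and `block_min_svm_dual` are hypothetical.

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    # Pairwise RBF kernel matrix (helper for this sketch only).
    sq = (X ** 2).sum(1)
    d2 = sq[:, None] + sq[None, :] - 2 * X @ X.T
    return np.exp(-gamma * d2)

def block_min_svm_dual(K, y, C=1.0, n_blocks=4, n_iter=50, seed=0):
    """Simplified block minimization for the kernel-SVM dual:
        min_a 0.5 a^T Q a - 1^T a,  0 <= a_i <= C,  Q = diag(y) K diag(y).
    Each outer iteration builds an update direction block by block using
    only the block-diagonal part of Q (these solves are what would run in
    parallel on different machines), then one line search over the combined
    direction guarantees the objective decreases."""
    n = len(y)
    Q = K * np.outer(y, y)
    a = np.zeros(n)
    rng = np.random.default_rng(seed)
    blocks = np.array_split(rng.permutation(n), n_blocks)

    def obj(a):
        return 0.5 * a @ Q @ a - a.sum()

    for _ in range(n_iter):
        grad = Q @ a - 1.0
        d = np.zeros(n)
        for B in blocks:
            # Approximate subproblem solve: one projected step
            # preconditioned by the block-diagonal Hessian Q[B, B].
            H = Q[np.ix_(B, B)] + 1e-8 * np.eye(len(B))
            step = np.linalg.solve(H, grad[B])
            d[B] = np.clip(a[B] - step, 0.0, C) - a[B]
        # Backtracking line search on the exact objective; in the
        # distributed version this needs only O(n) communication.
        t, f0 = 1.0, obj(a)
        while obj(a + t * d) > f0 and t > 1e-10:
            t *= 0.5
        a = a + t * d
    return a
```

With the box constraint kept inside each block update, the combined direction is always feasible, so the line search only has to shrink the step until the objective decreases.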
Keywords: predictive performance; computational challenges; kernel machine solvers; line search
Source: VideoLectures.NET
Data collected: 2022-11-20: chenxin01
Last reviewed: 2022-11-20: chenxin01
Views: 26