PPDsparse: A Parallel Primal-Dual Sparse Method for Extreme Classification
Course URL: | http://videolectures.net/kdd2017_huang_extreme_classification/
Lecturer: | Xiangru Huang
Institution: | The University of Texas at Austin
Date: | 2017-10-09
Language: | English
Course description: | Extreme Classification comprises multi-class or multi-label prediction where there is a large number of classes, and is increasingly relevant to many real-world applications such as text and image tagging. In this setting, standard classification methods, whose complexity is linear in the number of classes, become intractable, while enforcing structural constraints among classes (such as low rank or tree structure) to reduce complexity often sacrifices accuracy for efficiency. The recent PD-Sparse method addresses this via an algorithm that is sub-linear in the number of variables, by exploiting the primal-dual sparsity inherent in a specific loss function, namely the max-margin loss. In this work, we extend PD-Sparse to be efficiently parallelized in large-scale distributed settings. By introducing separable loss functions, we can scale out the training, with network communication and space efficiency comparable to those of one-versus-all approaches, while maintaining an overall complexity sub-linear in the number of classes. On several large-scale benchmarks, our proposed method achieves accuracy competitive with the state of the art while reducing training time from days to tens of minutes compared with existing parallel or sparse methods on a cluster of 100 cores.
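Note: the worked equation below is not from the lecture page; it is a minimal sketch, assuming the standard multi-class max-margin (Crammer-Singer) formulation, of the loss whose primal-dual sparsity the abstract refers to. The symbols W, w_k, x_i, y_i, and K are mine. For an example x_i with label y_i and per-class weight vectors W = [w_1, ..., w_K], the loss is

\ell(W; x_i, y_i) = \max_{k \neq y_i} \bigl( 1 + w_k^\top x_i - w_{y_i}^\top x_i \bigr)_+ , \qquad (u)_+ = \max(u, 0).

Only the classes that violate the margin for a given example contribute, so at the optimum the corresponding dual variables are zero for all but a few classes per example. This dual sparsity is what allows per-example training cost to stay sub-linear in the number of classes, and the separable losses introduced in this work aim to preserve it while partitioning classes across machines with communication comparable to one-versus-all.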
Keywords: | extreme classification; primal-dual sparsity; label prediction
Source: | VideoLectures.NET
Data collected: | 2023-06-19: chenxin01
Last reviewed: | 2023-06-19: chenxin01
Views: | 33