

FDML: A Collaborative Machine Learning Framework for Distributed Features
Course URL: http://videolectures.net/kdd2019_hu_niu_yang/
Lecturer: Yaochen Hu
Institution: University of Alberta
Date: 2020-03-02
Language: English
Course description: Most current distributed machine learning systems try to scale up model training by using a data-parallel architecture that divides the computation for different samples among workers. We study distributed machine learning from a different motivation, where the information about the same samples, e.g., users and objects, is owned by several parties that wish to collaborate but do not want to share raw data with each other. We propose an asynchronous stochastic gradient descent (SGD) algorithm for such a feature-distributed machine learning (FDML) problem, to jointly learn from distributed features, with theoretical convergence guarantees under bounded asynchrony. Our algorithm does not require sharing the original features or even local model parameters between parties, thus preserving data locality. The system can also easily incorporate differential privacy mechanisms to preserve a higher level of privacy. We implement the FDML system in a parameter server architecture and compare it with fully centralized learning (which violates data locality) and with learning based on only local features, through extensive experiments on both the public a9a dataset and a large dataset of 5,000,000 records and 8,700 decentralized features from three collaborating apps at Tencent: Tencent MyApp, Tencent QQ Browser and Tencent Mobile Safeguard. Experimental results demonstrate that the proposed FDML system can significantly enhance app recommendation in Tencent MyApp by leveraging user and item features from other apps, while preserving the locality and privacy of features in each individual app to a high degree.
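
The idea of learning jointly from features held by different parties can be illustrated with a small sketch. This is not the authors' implementation: it assumes a linear local model per party, logistic loss, synthetic data, and a synchronous update loop in place of the asynchronous, parameter-server scheme the description mentions. All function and variable names (`train_feature_distributed`, `feature_blocks`, and so on) are hypothetical. In the sketch, only per-sample partial scores and the shared residual cross party boundaries; raw features and local weights stay with their owner.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train_feature_distributed(feature_blocks, labels, lr=0.1, epochs=50,
                              batch_size=32, seed=0):
    """Synchronous mini-batch SGD for logistic regression over split features.

    feature_blocks[j] is the (n_samples, d_j) matrix held by party j
    (an assumed layout for illustration). Returns one weight vector per party.
    """
    rng = np.random.default_rng(seed)
    n = labels.shape[0]
    weights = [np.zeros(block.shape[1]) for block in feature_blocks]
    for _ in range(epochs):
        order = rng.permutation(n)
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            # Each party computes a partial score from its own features only.
            partial = [b[idx] @ w for b, w in zip(feature_blocks, weights)]
            # A coordinator sums the partial scores and forms the residual.
            residual = sigmoid(np.sum(partial, axis=0)) - labels[idx]
            # Each party updates its own weights from the shared residual.
            for j, block in enumerate(feature_blocks):
                weights[j] -= lr * block[idx].T @ residual / len(idx)
    return weights


def predict_proba(feature_blocks, weights):
    return sigmoid(sum(b @ w for b, w in zip(feature_blocks, weights)))


def log_loss(p, y):
    eps = 1e-12
    return float(-np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps)))


# Synthetic demo: two parties, each holding three of six feature columns.
rng = np.random.default_rng(1)
X = rng.normal(size=(400, 6))
true_w = np.array([1.5, -2.0, 0.5, 1.0, -1.0, 2.0])
y = (X @ true_w > 0).astype(float)
blocks = [X[:, :3], X[:, 3:]]

weights = train_feature_distributed(blocks, y)
joint_loss = log_loss(predict_proba(blocks, weights), y)

# Baseline: the first party trains on its local features alone.
local_w = train_feature_distributed(blocks[:1], y)
local_loss = log_loss(predict_proba(blocks[:1], local_w), y)

print("joint loss:", joint_loss)
print("local-only loss:", local_loss)
```

The local-only baseline mirrors the comparison named in the description (learning from local features versus learning from all parties' features); the synthetic data says nothing about the results reported on a9a or the Tencent dataset.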
Keywords: FDML; machine learning; gradient descent algorithm; differential privacy
Source: VideoLectures.NET
Data collected: 2020-04-30:zhouxj
Last reviewed: 2021-12-20:liyy
Views: 69