Differentially Private Recommender Systems: Building Privacy into the Netflix Prize Contenders
Course URL: http://videolectures.net/kdd09_mcsherry_dprsbpin/
Lecturer: Frank McSherry
Institution: Microsoft
Date: 2009-09-14
Language: English
Course description: We consider the problem of producing recommendations from collective user behavior while simultaneously providing guarantees of privacy for these users. Specifically, we consider the Netflix Prize data set, and its leading algorithms, adapted to the framework of differential privacy. Unlike prior privacy work concerned with cryptographically securing the computation of recommendations, differential privacy constrains a computation in a way that precludes any inference about the underlying records from its output. Such algorithms necessarily introduce uncertainty (i.e., noise) to computations, trading accuracy for privacy. We find that several of the leading approaches in the Netflix Prize competition can be adapted to provide differential privacy without significantly degrading their accuracy. To adapt these algorithms, we explicitly factor them into two parts: an aggregation/learning phase that can be performed with differential privacy guarantees, and an individual recommendation phase that uses the learned correlations and an individual's data to provide personalized recommendations. The adaptations are non-trivial, and involve both careful analysis of the per-record sensitivity of the algorithms to calibrate noise and new post-processing steps to mitigate the impact of this noise. We measure the empirical trade-off between accuracy and privacy in these adaptations, and find that we can provide non-trivial formal privacy guarantees while still outperforming the Cinematch baseline that Netflix provides.
Keywords: Computer science; Machine learning; Semi-supervised learning
Source: VideoLectures.NET
Last reviewed: 2020-06-20:zyk
Views: 112
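
The course description's central idea is to add noise calibrated to per-record sensitivity in an aggregation phase, then personalize in a separate phase. It can be sketched roughly as follows. This is a minimal illustration using the standard Laplace mechanism on a single global rating average, not the algorithms presented in the lecture. The function names, the rating range of 1 to 5, the even split of the privacy budget, and the damping parameter `beta` are all assumptions made for this example.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponential draws with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_global_average(ratings, epsilon, rng, lo=1.0, hi=5.0):
    """Aggregation phase (hypothetical): release a noisy global average.

    Assumes neighboring data sets differ by adding or removing one rating,
    so the sum has sensitivity `hi` and the count has sensitivity 1.
    Each query gets half of the privacy budget `epsilon`.
    """
    half = epsilon / 2.0
    clipped = [min(max(r, lo), hi) for r in ratings]
    noisy_sum = sum(clipped) + laplace_noise(hi / half, rng)
    noisy_count = len(clipped) + laplace_noise(1.0 / half, rng)
    if noisy_count <= 0:
        # Post-processing fallback when noise swamps a tiny data set.
        return (lo + hi) / 2.0
    # Clamping is post-processing and does not weaken the guarantee.
    return min(max(noisy_sum / noisy_count, lo), hi)


def predict(user_ratings, global_avg, beta=2.0, lo=1.0, hi=5.0):
    """Recommendation phase (hypothetical): combine the released aggregate
    with one user's own ratings, which never leave that user's context."""
    offset = sum(r - global_avg for r in user_ratings) / (len(user_ratings) + beta)
    return min(max(global_avg + offset, lo), hi)


if __name__ == "__main__":
    rng = random.Random(0)
    ratings = [4.0] * 10000
    avg = private_global_average(ratings, epsilon=1.0, rng=rng)
    print("noisy global average:", avg)
    print("prediction for a user who rated [5, 5]:", predict([5.0, 5.0], avg))
```

With many ratings, the noise on the sum and count is small relative to their true values, which is one informal way to see why accuracy need not degrade much on a data set of this size.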