Large Scale Online Bayesian Recommendations
Course URL: http://videolectures.net/www09_stern_lsobr/
Lecturers: Thore Graepel; David Stern; Ralf Herbrich
Institution: Microsoft
Date: 2009-05-20
Language: English
Course description: We present a probabilistic model for generating personalised recommendations of items to users of a web service. The system combines content information, in the form of user and item metadata, with collaborative filtering information from previous user behavior in order to predict the value of an item for a user. Users and items are represented by feature vectors which are mapped into a low-dimensional 'trait space' in which similarity is measured in terms of inner products. The model can be trained from different types of feedback in order to learn user-item preferences. Here we present three alternatives: direct observation of an absolute rating each user gives to some items, observation of a binary preference (like/don't like), and observation of a set of ordinal ratings on a user-specific scale. Efficient inference is achieved by approximate message passing involving a combination of Expectation Propagation (EP) and Variational Message Passing. We also include a dynamics model which allows an item's popularity, a user's taste, or a user's personal rating scale to drift over time. By using Assumed-Density Filtering (ADF) for training, the model requires only a single pass through the training data. This is an on-line learning algorithm capable of incrementally taking account of new data, so the system can immediately reflect the latest user preferences. We evaluate the performance of the algorithm on the MovieLens and Netflix data sets, consisting of ~1,000,000 and ~100,000,000 ratings respectively. The results show that training the model with the on-line ADF approach yields state-of-the-art performance, with the option of improving performance further, if computational resources are available, by performing multiple EP passes over the training data.
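The trait-space idea in the description can be illustrated with a minimal sketch. It assumes linear maps from feature vectors to traits and uses fixed point-estimate weights; the model described in the talk instead maintains probabilistic beliefs over these quantities and updates them by message passing, which is not shown here. The function names, the `bias` parameter, and all numeric values are hypothetical.

```python
def to_traits(weights, features):
    """Map a feature vector into trait space (one dot product per trait row)."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def affinity(user_weights, item_weights, user_features, item_features, bias=0.0):
    """Score a user-item pair as the inner product of their trait vectors."""
    s = to_traits(user_weights, user_features)  # user traits
    t = to_traits(item_weights, item_features)  # item traits
    return sum(a * b for a, b in zip(s, t)) + bias


# Hypothetical toy weights: 2 traits, 3 user features, 2 item features.
U = [[1.0, 0.0, 0.5],
     [0.0, 2.0, 0.0]]
V = [[0.5, 1.0],
     [1.0, 0.0]]

user = [1.0, 0.0, 1.0]  # e.g. indicator features from user metadata
item = [0.0, 1.0]       # e.g. indicator features from item metadata

score = affinity(U, V, user, item)
```

A larger score means the user's and the item's trait vectors point in more similar directions, which is the sense in which similarity is measured by inner products.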
Keywords: probabilistic model; feature vectors; Expectation Propagation; density filtering
Source: VideoLectures.NET
Last reviewed: 2020-05-15 (chenxin)