
Fast Online Learning through Offline Initialization for Time-sensitive Recommendation
Course URL: http://videolectures.net/kdd2010_chen_folt/
Lecturer: Bee-Chung Chen
Institution: LinkedIn
Date: 2010-10-01
Language: English
Course description: Recommender problems with large and dynamic item pools are ubiquitous in web applications like content optimization, online advertising and web search. Despite the availability of rich item meta-data, excess heterogeneity at the item level often requires inclusion of item-specific "factors" (or weights) in the model. However, since estimating item factors is computationally intensive, it poses a challenge for time-sensitive recommender problems where it is important to rapidly learn factors for new items (e.g., news articles, event updates, tweets) in an online fashion. In this paper, we propose a novel method called FOBFM (Fast Online Bilinear Factor Model) to learn item-specific factors quickly through online regression. The online regression for each item can be performed independently and hence the procedure is fast, scalable and easily parallelizable. However, the convergence of these independent regressions can be slow due to high dimensionality. The central idea of our approach is to use a large amount of historical data to initialize the online models based on offline features and learn linear projections that can effectively reduce the dimensionality. We estimate the rank of our linear projections by taking recourse to online model selection based on optimizing predictive likelihood. Through extensive experiments, we show that our method significantly and uniformly outperforms other competitive methods and obtains relative lifts that are in the range of 10-15% in terms of predictive log-likelihood, 200-300% for a rank correlation metric on a proprietary My Yahoo! dataset; it obtains 9% reduction in root mean squared error over the previously best method on a benchmark MovieLens dataset using a time-based train/test data split.
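
The description above compresses the method into a few sentences: an offline phase uses historical data to fit a baseline on offline features and to learn a low-rank linear projection, and the online phase then fits each new item's factor by an independent regression in the projected, low-dimensional space. The following is a minimal sketch only, not the authors' implementation: the class name OnlineItemFactor, the ridge-style conjugate update, and all parameters are illustrative assumptions.

```python
import numpy as np

# Hypothetical sketch of a FOBFM-style per-item online update.
# The offline phase is assumed to supply:
#   B : d x k projection learned from historical data (k << d)
#   an "offset" per observation, i.e. the offline model's prediction.
# Online phase: each new item gets its own k-dimensional factor,
# estimated by an independent Bayesian/ridge regression.

class OnlineItemFactor:
    def __init__(self, B, sigma2=1.0, prior_var=1.0):
        _, k = B.shape
        self.B = B                      # offline-learned projection
        self.P = np.eye(k) / prior_var  # posterior precision of the factor
        self.q = np.zeros(k)            # accumulated projected residuals
        self.sigma2 = sigma2            # observation-noise variance

    def update(self, x_user, y, offset):
        # Project the high-dimensional user/context feature vector down
        # to k dimensions, then do one conjugate (ridge) update on the
        # residual left over after the offline prediction.
        z = self.B.T @ x_user
        self.P += np.outer(z, z) / self.sigma2
        self.q += z * (y - offset) / self.sigma2

    def factor(self):
        # Posterior mean of the item-specific factor.
        return np.linalg.solve(self.P, self.q)

    def predict(self, x_user, offset):
        return offset + (self.B.T @ x_user) @ self.factor()

# Toy usage: one new item observed in a stream.
rng = np.random.default_rng(0)
d, k = 1000, 5
B = rng.standard_normal((d, k)) / np.sqrt(d)  # stand-in projection
item = OnlineItemFactor(B)
for _ in range(50):
    x = rng.standard_normal(d)
    item.update(x, y=rng.standard_normal(), offset=0.0)
print(item.factor())
```

Because each item keeps only its own small (P, q) state and never touches another item's, updates for different items can run independently, which is what makes the procedure fast, scalable and easy to parallelize as claimed in the description.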
Keywords: dynamic item pools; content optimization; web search
Source: VideoLectures.NET
Last reviewed: 2020-01-13 (chenxin)
Views: 64