
Stumping along a summary
Course URL: http://videolectures.net/explorationexploitation2011_salperwyck_u...  
Lecturers: Christophe Salperwyck; Tanguy Urvoy
Institution: France Telecom Research
Date: Not available.
Language: English
Description: The methods we used to compete in the « Exploration & Exploitation » challenge are based on three layers. The first layer provides an online summary of the data stream for continuous and nominal data. Continuous data are handled using the Greenwald and Khanna online quantile summary, which provides error guarantees for a fixed memory size. Nominal data are summarized with a hash-based counting structure. With these techniques we managed to build an accurate stream summary with a small memory footprint. The second layer uses the summary to build predictors. We explored several kinds of trees, from simple decision stumps to deep multivariate ones. The stumps proved to be remarkably stable and efficient. On the other hand, a progressive unfolding of the trees seemed to improve the model in the long run. For the last layer, we explored several combination strategies: online bagging, exponential weighting, a linear ranker, etc. We observed a tradeoff between the expressiveness of the predictors and the power of the combination strategy, but since most strategies were difficult to tune, we went back to simple averaging. It seems, from our experiments, that both the need for exploration and the click scarcity sharpen the need for very stable models.
Keywords: predictors; decision stumps; exponential weighting; combination strategies
Source: VideoLectures.NET
Last reviewed: 2019-12-01:cwx
Views: 29
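
The three layers in the description can be illustrated with a small sketch. It is not the authors' implementation. It covers nominal features only (the Greenwald and Khanna quantile summary for continuous data is left out), treats each feature as a one-level, multiway stump that predicts the click rate of the bucket a value hashes to, and combines the stumps by simple averaging. The names `HashedCounts` and `StumpEnsemble`, the bucket count, and the smoothing rule are all assumptions made for illustration.

```python
import zlib
from typing import List, Sequence


class HashedCounts:
    """Layer 1 (assumed form): fixed-size click/display counters for one nominal feature."""

    def __init__(self, n_buckets: int = 64) -> None:
        self.n_buckets = n_buckets
        self.displays: List[int] = [0] * n_buckets
        self.clicks: List[int] = [0] * n_buckets

    def _bucket(self, value: str) -> int:
        # crc32 is deterministic across runs, unlike the built-in hash() on str.
        return zlib.crc32(value.encode("utf-8")) % self.n_buckets

    def update(self, value: str, clicked: bool) -> None:
        b = self._bucket(value)
        self.displays[b] += 1
        self.clicks[b] += int(clicked)

    def rate(self, value: str, prior: float) -> float:
        # Layer 2 (assumed form): a one-level stump whose leaf is the value's bucket.
        # One pseudo-display at the prior rate keeps rare buckets stable.
        b = self._bucket(value)
        return (self.clicks[b] + prior) / (self.displays[b] + 1)


class StumpEnsemble:
    """Layer 3 (assumed form): plain average of one stump per feature."""

    def __init__(self, n_features: int, n_buckets: int = 64) -> None:
        self.summaries = [HashedCounts(n_buckets) for _ in range(n_features)]
        self.total_displays = 0
        self.total_clicks = 0

    def update(self, row: Sequence[str], clicked: bool) -> None:
        self.total_displays += 1
        self.total_clicks += int(clicked)
        for summary, value in zip(self.summaries, row):
            summary.update(value, clicked)

    def predict(self, row: Sequence[str]) -> float:
        # Global click rate with add-one smoothing, used as the prior for every stump.
        prior = (self.total_clicks + 1) / (self.total_displays + 2)
        rates = [s.rate(v, prior) for s, v in zip(self.summaries, row)]
        return sum(rates) / len(rates)


if __name__ == "__main__":
    model = StumpEnsemble(n_features=2)
    for _ in range(3):
        model.update(("a", "x"), clicked=True)
        model.update(("b", "x"), clicked=False)
    print(model.predict(("a", "x")), model.predict(("b", "x")))
```

Memory stays fixed at `n_features * n_buckets` counter pairs regardless of stream length, at the cost of hash collisions merging the statistics of distinct values.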