

Combining Decision Trees and Neural Networks for Learning-to-Rank in Personal Search
Course URL: http://videolectures.net/kdd2019_li_qin_wang/
Lecturer: Pan Li
Institution: University of Illinois at Chicago
Date: 2020-03-02
Language: English
Abstract: Decision Trees (DTs) like LambdaMART have been one of the most effective types of learning-to-rank algorithms in the past decade. They typically work well with hand-crafted dense features (e.g., BM25 scores). Recently, Neural Networks (NNs) have shown impressive results in leveraging sparse and complex features (e.g., query and document keywords) directly when a large amount of training data is available. While there is a large body of work on how to use NNs for semantic matching between queries and documents, relatively less work has been conducted to compare NNs with DTs for general learning-to-rank tasks, where dense features are also available and DTs can achieve state-of-the-art performance. In this paper, we study how to combine DTs and NNs to effectively bring the benefits from both sides in the learning-to-rank setting. Specifically, we focus our study on personal search, where clicks are used as the primary labels with unbiased learning-to-rank algorithms and a significantly large amount of training data is easily available. Our combination methods are based on ensemble learning. We design 12 variants and compare them based on two aspects, ranking effectiveness and ease of deployment, using two of the largest personal search services: Gmail search and Google Drive search. We show that direct application of existing ensemble methods cannot achieve both aspects. We thus design a novel method that uses NNs to compensate for DTs via boosting. We show that such a method is not only easier to deploy, but also gives comparable or better ranking accuracy.
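The abstract's core idea, letting a NN "compensate for" a DT ensemble via boosting, can be sketched as residual fitting: train the tree ensemble first, then train the NN only on the errors the trees leave behind, and score by summing both stages. The sketch below is a minimal illustration of that boosting pattern, not the paper's implementation; the synthetic data, model choices, and hyperparameters are all assumptions.

```python
# Illustrative sketch of boosting a NN on top of a DT ensemble.
# Not the paper's method: data, models, and hyperparameters are assumed.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Synthetic stand-ins for dense ranking features and a relevance label;
# the paper instead uses click labels from Gmail and Google Drive search.
X = rng.normal(size=(500, 8))
y = X[:, 0] + np.sin(X[:, 1]) + 0.1 * rng.normal(size=500)

# Stage 1: a DT ensemble fit on the dense features.
dt = GradientBoostingRegressor(n_estimators=50, random_state=0)
dt.fit(X, y)

# Stage 2: a NN fit on the residuals of the trees, so it only
# has to learn the structure the DT ensemble missed.
residuals = y - dt.predict(X)
nn = MLPRegressor(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
nn.fit(X, residuals)

def score(X_new):
    # Boosted combination: DT score plus the NN's correction term.
    return dt.predict(X_new) + nn.predict(X_new)

combined = score(X)
```

One practical appeal of this ordering, as the abstract suggests, is ease of deployment: an existing, already-serving DT ranker can stay untouched while the NN correction is trained and added on top.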
Keywords: neural networks; decision trees; learning-to-rank; personal search; hand-crafted dense features
Source: VideoLectures.NET
Data collected: 2022-09-15:cyh
Last reviewed: 2022-09-19:cyh
Views: 52