
Learning Consensus Opinion: Mining Data from a Labeling Game
Lecturers: Paul N. Bennett; Anton Mityagin; David Maxwell Chickering
Institution: Microsoft Corporation
Course description: In this paper, we consider how to identify the consensus opinion of a set of users on how the results for a query should be ranked. Once consensus rankings are identified for a set of queries, they can serve for both evaluating and training retrieval and learning systems. We present a novel approach to collecting user preferences over image-search results: a collaborative game in which players are rewarded for agreeing on which image result is best for a query. Our approach is distinct from other labeling games because we can directly elicit the preferences of interest for image queries extracted from query logs. As a source of relevance judgments, this data provides a useful complement to click data. Furthermore, it is free of positional biases, and it avoids the risk, associated with proposed mechanisms for debiasing clicks, of frustrating users with non-relevant results. We describe data collected over 35 days from a deployed version of this game, amounting to about 19 million expressed preferences between pairs. Finally, we present several approaches to modeling this data in order to extract consensus rankings from the preferences and better sort the search results for targeted queries.
Keywords: image search; data mining; modeling
