Mining Broad Latent Query Aspects from Search Sessions
Course URL: http://videolectures.net/kdd09_punera_mblqass/
Lecturer: Kunal Punera
Institution: Yahoo! Inc.
Date: Not available.
Language: English
Abstract: Search queries are typically very short, which means they are often underspecified or have senses that the user did not think of. A broad latent query aspect is a set of keywords that succinctly represents one particular sense, or one particular information need, and that can help users reformulate such queries. We extract these broad latent aspects from query reformulations found in historical search session logs. We propose a framework under which the extraction problem reduces to optimizing a formal objective function under two constraints: the total number of aspects the system can store, and the number of aspects that can be shown in response to any given query. We present algorithms to find a good set of aspects, and to pick the best k aspects matching any query. Empirical results on real-world search engine logs show significant gains over a strong baseline that uses single-keyword reformulations: 14% in human-judged accuracy, 23% in click-through data, and around 20% in consistency among aspects predicted for "similar" queries. This demonstrates both the importance of broad query aspects and the efficacy of our algorithms for extracting them.
Keywords: objective function; search engine; algorithm effectiveness
Source: VideoLectures.NET
Last reviewed: 2020-01-13 (chenxin)
Views: 32