The "Real World" Web Search Problem
Course URL: http://videolectures.net/mmdss07_glover_trw/
Lecturer: Eric Glover
Institution:
Date: 2007-12-03
Language: English
Course description: There are numerous papers which present methods to address web-search related challenges such as relevance and ranking, query processing, and classification. Unfortunately, many of these methods are ineffective in a large-scale commercial setting, despite statistically significant experimental results. To help bridge this gap between academic and commercial settings, this lecture examines the components of large-scale commercial search engines, then proposes five classes of problems encountered by researchers in this area: biases; bad or different assumptions about statistics, users, queries or web contents; insufficient or missing data; inconsistencies related to evaluations and objectives; and policies or external factors, including resource limitations. Using real stories and personal experiences, the lecture illustrates examples of these problems, along with a few proposed approaches to deal with or reduce their consequences or effects. In addition to the classes of problems, there are several fundamental properties of the web that are often not considered sufficiently when performing experiments or defining problems, resulting in unrealistic experiments or objectives. Even within a search engine, overlooking key properties such as the non-stationarity of the users and the web can result in ineffective evaluations, and may even lead to failed subsystems. Fortunately, very simple approaches can often be highly effective. This lecture helps put context on how commercial search engines work, what problems they face, what effective solutions require, and how evaluations and problem definitions could be changed to more effectively predict success in a commercial setting, while still retaining the interest of researchers.
Keywords: web search; statistics; commercial search engines
Source: VideoLectures.NET
Last reviewed: 2020-06-29:yumf
Views: 53