
Turning Down the Noise in the Blogosphere
Course URL: http://videolectures.net/kdd09_veda_tdnb/
Lecturer: Gaurav Veda
Institution: Carnegie Mellon University
Date: 2009-09-14
Language: English
Course description: In recent years, the blogosphere has experienced a substantial increase in the number of posts published daily, forcing users to cope with information overload. The task of guiding users through this flood of information has thus become critical. To address this issue, we present a principled approach for picking a set of posts that best covers the important stories in the blogosphere. We define a simple and elegant notion of coverage and formalize it as a submodular optimization problem, for which we can efficiently compute a near-optimal solution. In addition, since people have varied interests, the ideal coverage algorithm should incorporate user preferences in order to tailor the selected posts to individual tastes. We define the problem of learning a personalized coverage function by providing an appropriate user-interaction model and formalizing an online learning framework for this task. We then provide a no-regret algorithm that can quickly learn a user's preferences from limited feedback. We evaluate our coverage and personalization algorithms extensively over real blog data. Results from a user study show that our simple coverage algorithm does as well as most popular blog aggregation sites, including Google Blog Search, Yahoo! Buzz, and Digg. Furthermore, we demonstrate empirically that our algorithm can successfully adapt to user preferences. We believe that our technique, especially with personalization, can dramatically reduce information overload.
Keywords: blogosphere; algorithms; feature-significance corpus
Source: VideoLectures.NET
Last reviewed: 2020-06-15 (wuyq)
Views: 56
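
Illustrative sketch: the course description frames post selection as a submodular coverage problem with an efficiently computable near-optimal solution. The Python below is a minimal sketch of the standard greedy heuristic for that kind of problem, not the lecture's actual method. It assumes a simplified coverage function (the number of distinct features covered), and the names `greedy_cover` and `example_posts`, along with the sample data, are hypothetical. For monotone submodular objectives under a size limit, this greedy rule is known to reach at least a (1 - 1/e) fraction of the optimum.

```python
def greedy_cover(posts, k):
    """Pick up to k posts that together cover the most distinct features.

    posts: dict mapping a post id to the set of features it covers
           (hypothetical representation, assumed for illustration).
    Returns (chosen_ids, covered_features).
    """
    covered = set()
    chosen = []
    for _ in range(min(k, len(posts))):
        best_id, best_gain = None, 0
        # Iterate in sorted order so ties are broken deterministically.
        for post_id in sorted(posts):
            if post_id in chosen:
                continue
            # Marginal gain: features this post adds beyond what is covered.
            gain = len(posts[post_id] - covered)
            if gain > best_gain:
                best_id, best_gain = post_id, gain
        if best_id is None:
            break  # No remaining post adds anything new.
        chosen.append(best_id)
        covered |= posts[best_id]
    return chosen, covered


# Hypothetical sample data: four posts tagged with the stories they mention.
example_posts = {
    "a": {"election", "economy"},
    "b": {"economy"},
    "c": {"sports", "movies", "economy"},
    "d": {"election"},
}

chosen, covered = greedy_cover(example_posts, 2)
print(chosen, sorted(covered))
```

Because each step counts only features not yet covered, a post that repeats an already-selected story contributes nothing, which is the diminishing-returns property that makes the objective submodular. A personalized variant could weight features by per-user preferences, but the learning procedure for those weights is not specified here.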