Can Who‑Edits‑What Predict Edit Survival? |
|
Course URL: | http://videolectures.net/kdd2018_kristof_edit_survival/ |
Lecturer: | Victor Kristof |
Institution: | École Polytechnique Fédérale de Lausanne (EPFL) |
Date: | 2018-11-23 |
Language: | English |
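Summary (translated from Chinese): | As online peer-production systems attract more contributors, it becomes increasingly important to predict whether a user's edit will ultimately benefit the project. Existing solutions either rely on a user reputation system or use a highly specialized predictor fitted to one particular peer-production system. This work explores a different point in the solution space: it goes beyond user reputation but uses no content-based features of the edits. Each edit is treated as a game between the editor and a component of the project. The probability that an edit is accepted is assumed to depend on the editor's skill, the difficulty of editing the component, and a user-component interaction term. The model is broadly applicable because it only needs data on who made an edit, what the edit affected, and whether the edit survived. Applied to Wikipedia and the Linux kernel, two examples of large-scale peer-production systems, the model predicts edit survival effectively in both cases. It significantly outperforms approaches based on user reputation alone and narrows the gap with specialized predictors that use content-based features. It is simple to implement, computationally cheap, and reveals interesting structure in the data. |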
Course description: | As the number of contributors to online peer-production systems grows, it becomes increasingly important to predict whether the edits that users make will eventually be beneficial to the project. Existing solutions either rely on a user reputation system or consist of a highly specialized predictor that is tailored to a specific peer-production system. In this work, we explore a different point in the solution space that goes beyond user reputation but does not involve any content-based feature of the edits. We view each edit as a game between the editor and the component of the project. We posit that the probability that an edit is accepted is a function of the editor’s skill, of the difficulty of editing the component, and of a user-component interaction term. Our model is broadly applicable, as it only requires observing data about who makes an edit, what the edit affects, and whether the edit survives or not. We apply our model to Wikipedia and the Linux kernel, two examples of large-scale peer-production systems, and we seek to understand whether it can effectively predict edit survival: in both cases, we provide a positive answer. Our approach significantly outperforms those based solely on user reputation and bridges the gap with specialized predictors that use content-based features. It is simple to implement, computationally inexpensive, and, in addition, it enables us to discover interesting structure in the data. |
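Keywords: | online peer-production systems; effectively predicting edit survival; Wikipedia and the Linux kernel; difficulty of editing a component |
Source: | VideoLectures.NET |
Data collected: | 2023-01-29: cyh |
Last reviewed: | 2023-01-30: cyh |
Views: | 19 |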
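
Illustrative sketch: the description above states only that the acceptance probability is "a function of" the editor's skill, the component's difficulty, and a user-component interaction term. The Python sketch below shows one way such a function could look. It assumes a logistic link and a dot product of latent vectors for the interaction term; both are assumptions made for illustration, and the names `survival_probability`, `skill`, `difficulty`, `user_vec`, `comp_vec`, and `bias` are hypothetical rather than taken from the talk.

```python
import math


def survival_probability(skill, difficulty, user_vec, comp_vec, bias=0.0):
    """Probability that an edit survives, under an ASSUMED logistic model.

    skill      -- scalar skill of the editor (hypothetical parameter)
    difficulty -- scalar difficulty of editing the component (hypothetical)
    user_vec   -- latent vector of the editor (hypothetical)
    comp_vec   -- latent vector of the component (hypothetical)
    bias       -- global offset (hypothetical)
    """
    # Interaction term: dot product of the two latent vectors (an assumption).
    interaction = sum(x * y for x, y in zip(user_vec, comp_vec))
    score = skill - difficulty + interaction + bias

    # Numerically stable logistic function.
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    e = math.exp(score)
    return e / (1.0 + e)


# With every parameter at zero, the sketch is indifferent between outcomes.
print(survival_probability(0.0, 0.0, [0.0, 0.0], [0.0, 0.0]))  # 0.5
```

Under these assumptions, a higher skill or a lower difficulty raises the predicted probability, and the interaction term lets a given editor fare better or worse on specific components than the two scalars alone would suggest. Fitting the parameters requires only the who, what, and survived-or-not observations mentioned in the description.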