Active Learning for Biomedical Citation Screening |
|
Course URL: | http://videolectures.net/kdd2010_wallace_albc/ |
Lecturer: | Byron C. Wallace |
Institution: | Brown University |
Date: | 2010-10-01 |
Language: | English |
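Summary (translated from the Chinese): | Active learning (AL) is an increasingly popular strategy for reducing the amount of labeled data needed to train a classifier, and with it the effort required of annotators. We describe a real, deployed application of AL to the biomedical citation screening problem, used for systematic reviews at the Tufts Evidence-based Practice Center. We propose a novel active learning strategy that exploits prior domain knowledge supplied by the expert (specifically, labeled features), and we extend the model with a linear programming algorithm for cases where the expert can provide ranked labeled features. Our methods outperform existing AL strategies on three real systematic review datasets. We argue that evaluation must be specific to the scenario under consideration. To that end, we propose a new evaluation framework for the finite-pool scenario, in which the main goal is to label a fixed set of examples rather than simply to induce a good predictive model. We use a method from medical decision theory to elicit from the domain expert the relative costs of false positives and false negatives, and build a utility measure of classification performance that incorporates the expert's preferences. Our findings suggest that experts can and should provide more information than instance labels. Beyond strong empirical results on citation screening, this work outlines many important steps for moving from simulated active learning to deploying AL in real applications. |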
Course description: | Active learning (AL) is an increasingly popular strategy for mitigating the amount of labeled data required to train classifiers, thereby reducing annotator effort. We describe a real-world, deployed application of AL to the problem of biomedical citation screening for systematic reviews at the Tufts Evidence-based Practice Center. We propose a novel active learning strategy that exploits a priori domain knowledge provided by the expert (specifically, labeled features) and extend this model via a Linear Programming algorithm for situations where the expert can provide ranked labeled features. Our methods outperform existing AL strategies on three real-world systematic review datasets. We argue that evaluation must be specific to the scenario under consideration. To this end, we propose a new evaluation framework for finite-pool scenarios, wherein the primary aim is to label a fixed set of examples rather than to simply induce a good predictive model. We use a method from medical decision theory for eliciting the relative costs of false positives and false negatives from the domain expert, constructing a utility measure of classification performance that integrates the expert preferences. Our findings suggest that the expert can, and should, provide more information than instance labels alone. In addition to achieving strong empirical results on the citation screening problem, this work outlines many important steps for moving away from simulated active learning and toward deploying AL for real-world applications. |
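Keywords: | active learning; biomedical citations; medical decision theory |
Source: | VideoLectures.NET |
Last reviewed: | 2020-06-06: Wei Xueqiong (魏雪琼), volunteer course editor |
Views: | 82 |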