

Consensus of Ambiguity: Theory of Active Learning for Biomedical Image Analysis Applications
Course URL: http://videolectures.net/prib2010_doyle_cata/
Instructor: Scott Doyle
Institution: Rutgers, The State University of New Jersey
Date: 2010-10-14
Language: English
Course description: Supervised classifiers require manually labeled training samples to classify unlabeled objects. Active Learning (AL) can be used to selectively label only "ambiguous" samples, ensuring that each labeled sample is maximally informative. This is invaluable in applications where manual labeling is expensive, as in medical images, where annotation of specific pathologies or anatomical structures is usually only possible by an expert physician. Existing AL methods each use a single definition of ambiguity, but there can be significant variation among individual methods. In this paper we present a consensus of ambiguity (CoA) approach to AL, in which only samples that are consistently labeled as ambiguous across multiple AL schemes are selected for annotation. CoA-based AL uses fewer samples than Random Learning (RL) while exploiting the variance between individual AL schemes to efficiently label training sets for classifier training. We use a consensus ratio to determine the variance between AL methods, and the CoA approach is used to train classifiers for three different medical image datasets: 100 prostate histopathology images, 18 prostate DCE-MRI patient studies, and 9,000 breast histopathology regions of interest from 2 patients. We use a Probabilistic Boosting Tree (PBT) to classify each dataset as either cancer or non-cancer (prostate), or high or low grade cancer (breast). Training is done using CoA-based AL and is evaluated in terms of accuracy and area under the receiver operating characteristic curve (AUC). CoA training yielded between 0.01-0.05% greater performance than RL for the same training set size; approximately 5-10 more samples were required for RL to match the performance of CoA, suggesting that CoA is a more efficient training strategy.
Keywords: supervised classifier; manual labeling; probabilistic boosting tree
Source: VideoLectures.NET
Last reviewed: 2019-09-14:lxf
Views: 67
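
The consensus step in the course description (annotate only samples that every AL scheme flags as ambiguous) can be illustrated with a minimal Python sketch. The scheme names, the sample indices, and the function `consensus_of_ambiguity` are hypothetical. The ratio computed here (agreed samples divided by samples flagged by at least one scheme) is an assumed stand-in for the consensus ratio mentioned in the abstract, whose exact definition is not given in this record.

```python
from typing import Dict, Set, Tuple


def consensus_of_ambiguity(ambiguous: Dict[str, Set[int]]) -> Tuple[Set[int], float]:
    """Return the samples flagged by every scheme and an assumed consensus ratio."""
    if not ambiguous:
        return set(), 0.0
    sets = list(ambiguous.values())
    agreed = set.intersection(*sets)  # flagged as ambiguous by all schemes
    flagged = set.union(*sets)        # flagged by at least one scheme
    # Assumption: ratio = |intersection| / |union|; not taken from the source.
    ratio = len(agreed) / len(flagged) if flagged else 0.0
    return agreed, ratio


# Hypothetical output of three AL schemes over a pool of unlabeled sample indices.
schemes = {
    "scheme_a": {1, 4, 7, 9},
    "scheme_b": {2, 4, 7, 8},
    "scheme_c": {4, 5, 7, 9},
}
to_annotate, ratio = consensus_of_ambiguity(schemes)
print(sorted(to_annotate), round(ratio, 3))
```

In this toy pool only samples 4 and 7 would be sent to the expert for labeling; a low ratio indicates that the individual schemes disagree substantially about which samples are ambiguous.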