An Optimization Based Framework for Dynamic Batch Mode Active Learning
Course URL: http://videolectures.net/nipsworkshops2010_chakraborty_obf/
Lecturer: Shayok Chakraborty
Institution: Arizona State University
Date: 2011-01-13
Language: English
Course description: Active learning techniques have gained popularity for reducing the human effort needed to annotate data instances when inducing a classifier. When faced with large quantities of unlabeled data, such algorithms automatically select the salient and representative samples for manual annotation. Batch mode active learning schemes have recently been proposed to select a batch of data instances simultaneously, rather than updating the classifier after every single query. While numerical optimization strategies seem a natural choice for this problem (selecting a batch of points so that a given objective criterion is optimized), many of the proposed approaches are based on greedy heuristics. Moreover, all existing work on batch mode active learning assumes that the batch size is given as an input to the problem. In this work, we propose a novel optimization-based strategy that dynamically decides the batch size as well as the specific points to be queried, based on the particular data stream in question. Our results on the widely used VidTIMIT and MBGC biometric datasets corroborate the efficacy of the framework in adaptively identifying the batch size and the particular data points to be selected for manual annotation in any batch mode active learning application.
Keywords: active learning techniques; annotated data; batch mode
Source: VideoLectures.NET
Last reviewed: 2019-09-07:lxf
Views: 50