Aspects of Semi-Supervised and Active Learning in Conditional Random Fields |
|
Course URL: | http://videolectures.net/ecmlpkdd2011_sokolovska_semisupervised/ |
Lecturer: | Nataliya Sokolovska |
Institution: | Macquarie University |
Date: | 2011-11-30 |
Language: | English |
Abstract: | Conditional random fields are among the state-of-the-art approaches to structured output prediction, and the model has been adopted for various real-world problems. Supervised classification is expensive because labeled data are usually costly to produce. Unlabeled data are relatively cheap, but how should they be used? Unlabeled data can be used to estimate the marginal probability of observations, and we exploit this idea in our work. Introducing unlabeled data and the probability of observations into a purely discriminative model is a challenging task. We consider an extrapolation of a recently proposed semi-supervised criterion to the conditional random field model and show its drawbacks. We discuss an alternative use of the marginal probability and propose a pool-based active learning approach based on quota sampling. We carry out experiments on synthetic as well as standard natural language data sets, and we show that the proposed quota sampling active learning method is efficient. |
Keywords: | conditional random fields; supervised classification; marginal probability |
Source: | VideoLectures.NET |
Last reviewed: | 2019-04-03:lxf |
Views: | 98 |