Learning from Inconsistent and Unreliable Annotators by a Gaussian Mixture Model and Bayesian Information Criterion
Course URL: http://videolectures.net/ecmlpkdd2011_obradovic_annotators/
Lecturer: Zoran Obradovic
Institution: Temple University
Date: 2011-11-30
Language: English
Abstract: Supervised learning from multiple annotators is an increasingly important problem in machine learning and data mining. This paper develops a probabilistic approach to this problem for the case where annotators are not only unreliable but also vary in performance depending on the data. The proposed approach uses a Gaussian mixture model (GMM) and the Bayesian information criterion (BIC) to find the best-fitting model to approximate the distribution of the instances. It then alternates between maximum a posteriori (MAP) estimation of the hidden true labels and maximum-likelihood (ML) estimation of the quality of each annotator. Experiments on emotional speech classification and CASP9 protein disorder prediction tasks show that the proposed approach improves on both the majority voting baseline and a previous data-independent approach. Moreover, the approach provides more accurate estimates of each annotator's performance for each Gaussian component, paving the way for understanding the behavior of each annotator.
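The alternating estimation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes binary labels, a uniform prior on the true label, and that each instance has already been assigned to one mixture component (for example, by a GMM whose number of components was chosen by BIC). The function name `estimate_labels_and_quality`, its parameters, and the smoothing constants are all hypothetical choices made for illustration.

```python
import math


def estimate_labels_and_quality(labels, components, n_components, n_iter=20):
    """Alternate between soft label estimates and per-component annotator accuracy.

    labels[i][j]  : 0/1 label given by annotator j to instance i (assumed binary).
    components[i] : index of the mixture component assigned to instance i
                    (assumed to come from a previously fitted mixture model).
    Returns (post, acc), where post[i] is the estimated P(true label = 1) and
    acc[k][j] is the estimated accuracy of annotator j within component k.
    """
    n = len(labels)
    m = len(labels[0])
    # Start from the fraction of annotators voting 1 (soft majority vote).
    post = [sum(row) / m for row in labels]
    acc = [[0.5] * m for _ in range(n_components)]

    for _ in range(n_iter):
        # Quality step: expected agreement of each annotator with the current
        # soft labels, computed separately inside each component.
        for k in range(n_components):
            idx = [i for i in range(n) if components[i] == k]
            for j in range(m):
                agree = sum(
                    post[i] if labels[i][j] == 1 else 1.0 - post[i] for i in idx
                )
                # Add-one smoothing (an assumption) keeps accuracy off 0 and 1.
                acc[k][j] = (agree + 1.0) / (len(idx) + 2.0)

        # Label step: posterior of the true label given the annotators'
        # votes and their accuracy in this instance's component.
        for i in range(n):
            k = components[i]
            log1 = 0.0  # log-likelihood of the votes if the true label is 1
            log0 = 0.0  # log-likelihood of the votes if the true label is 0
            for j in range(m):
                a = acc[k][j]
                if labels[i][j] == 1:
                    log1 += math.log(a)
                    log0 += math.log(1.0 - a)
                else:
                    log1 += math.log(1.0 - a)
                    log0 += math.log(a)
            post[i] = 1.0 / (1.0 + math.exp(log0 - log1))

    return post, acc
```

Thresholding `post[i]` at 0.5 gives a hard label estimate under these assumptions. Keeping a separate accuracy table per component is what lets an annotator be rated reliable in one region of the data and unreliable in another, which is the data-dependent behavior the abstract refers to.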
Keywords: machine learning; Gaussian process; Bayesian
Source: VideoLectures.NET
Last reviewed: 2020-06-06, Wei Xueqiong (魏雪琼), volunteer course editor
Views: 65