An integrated generative and discriminative Bayesian model for binary classification
Course URL: http://videolectures.net/mlsb2010_harris_aig/  
Lecturer: Keith James Harris
Institution: University of Glasgow
Date: 2010-11-08
Language: English
Course description: much smaller number of samples. Analysing such data is statistically challenging because the covariates are highly correlated, which leads to unstable parameter estimates and inaccurate predictions. To alleviate this problem, we have developed a statistical model that classifies samples via a probit regression model using a small number of meta-covariates, inferred from the data through a Gaussian mixture model, rather than all the original covariates. A graphical overview of our model is presented in Figure 1 below. The novelty of our approach is that the meta-covariates are formed by considering predictor-outcome correlations as well as inter-predictor correlations. This idea was partly inspired by recent empirical research showing that optimum predictive performance often corresponds to an intermediate trade-off between the purely generative and purely discriminative approaches to classification [2]. The main advantage over a sparse classification model [1] is that we can extract a much larger subset of covariates with essential predictive power and partition this subset into groups within which the covariates are similar.
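The two building blocks named in the description (a Gaussian mixture over the covariates and a probit regression on the resulting meta-covariates) can be illustrated with a minimal sketch. This is a simplified two-stage approximation, not the integrated model from the lecture: the clustering here ignores the class labels, whereas the lecture's meta-covariates also account for predictor-outcome correlations. Everything below is an assumption made for illustration: the function names, the spherical covariances, the use of component means as meta-covariates, the ridge penalty `lam`, and the synthetic data.

```python
import math
import numpy as np

def norm_cdf(z):
    """Standard normal CDF, built from math.erf to avoid a SciPy dependency."""
    return 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))

def fit_spherical_gmm(X, k, n_iter=30):
    """EM for a spherical Gaussian mixture over the rows of X (one row per covariate)."""
    n, d = X.shape
    # Farthest-point initialisation: deterministic and spreads the starting means.
    idx = [0]
    for _ in range(k - 1):
        d2 = ((X[:, None, :] - X[idx][None, :, :]) ** 2).sum(axis=2).min(axis=1)
        idx.append(int(d2.argmax()))
    means = X[idx].copy()
    var = np.full(k, X.var() + 1e-6)
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: responsibilities, computed in log space for numerical stability.
        sq = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        log_r = np.log(weights) - 0.5 * d * np.log(2 * np.pi * var) - 0.5 * sq / var
        log_r -= log_r.max(axis=1, keepdims=True)
        resp = np.exp(log_r)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update mixing weights, means and per-component variances.
        nk = resp.sum(axis=0) + 1e-12
        means = (resp.T @ X) / nk[:, None]
        sq = ((X[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        var = (resp * sq).sum(axis=0) / (nk * d) + 1e-6
        weights = nk / n
    return means, resp

def fit_probit(M, y, lam=1.0, n_iter=2000):
    """Ridge-penalised probit regression fitted by plain gradient ascent."""
    A = np.column_stack([np.ones(len(y)), M])          # prepend an intercept column
    beta = np.zeros(A.shape[1])
    step = 1.0 / (np.linalg.norm(A, 2) ** 2 + lam)     # conservative step size
    for _ in range(n_iter):
        eta = A @ beta
        p = np.clip(norm_cdf(eta), 1e-12, 1 - 1e-12)
        pdf = np.exp(-0.5 * eta ** 2) / math.sqrt(2 * math.pi)
        grad = A.T @ ((y - p) * pdf / (p * (1 - p))) - lam * beta
        beta += step * grad
    return beta

def predict_proba(M, beta):
    """P(y = 1) under the fitted probit model."""
    A = np.column_stack([np.ones(M.shape[0]), M])
    return norm_cdf(A @ beta)

# Synthetic data (hypothetical): 60 correlated covariates in 3 groups, 40 samples.
rng = np.random.default_rng(0)
n_samples, per_group = 40, 20
latent = rng.normal(size=(3, n_samples))
X = np.repeat(latent, per_group, axis=0) + 0.3 * rng.normal(size=(3 * per_group, n_samples))
y = (latent[0] > 0).astype(float)                      # only the first group is predictive

means, resp = fit_spherical_gmm(X, k=3)
M = means.T                                            # samples x meta-covariates
beta = fit_probit(M, y)
acc = float(np.mean((predict_proba(M, beta) > 0.5) == (y == 1)))
print("training accuracy:", acc)
```

In this sketch the 60 highly correlated covariates collapse to 3 meta-covariates before classification, which is the dimensionality reduction the description motivates. The responsibilities in `resp` give the partition of covariates into groups.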
Keywords: covariates; meta-covariate inference; Gaussian mixture model; sparse classification model
Source: VideoLectures.NET
Last reviewed: 2020-06-01:wuyq
Views: 70