


Feature Selection via Block-Regularized Regression
Course URL: http://videolectures.net/cmulls08_kim_fsbrr/
Lecturer: Seyoung Kim
Institution: Carnegie Mellon University
Date: 2008-10-21
Language: English
Course description: Identifying co-varying causal elements in very high dimensional feature space with internal structures, e.g., a space with as many as millions of linearly ordered features, as one typically encounters in problems such as whole genome association (WGA) mapping, remains an open problem in statistical learning. We propose a block-regularized regression model for sparse variable selection in a high-dimensional space where the covariates are linearly ordered, and are possibly subject to local statistical linkages (e.g., block structures) due to spatial or temporal proximity of the features. Our goal is to identify a small subset of relevant covariates that are not merely from random positions in the ordering, but grouped as contiguous blocks from a large number of ordered covariates. Following a typical linear regression framework between the features and the response, our proposed model employs a sparsity-enforcing Laplacian prior for the regression coefficients, augmented by a 1st-order Markovian process along the feature sequence that "activates" the regression coefficients in a coupled fashion. We describe a sampling-based learning algorithm and demonstrate the performance of our method on simulated and biological data for marker identification under WGA.
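The description above combines a linear regression between features and response, a sparsity-enforcing Laplacian prior on the coefficients, and a first-order Markov process along the feature ordering that activates coefficients in a coupled way, so that relevant covariates appear as contiguous blocks. The sketch below is a minimal, hypothetical illustration of that generative structure in NumPy; the dimensions, transition probabilities, Laplace scale, and the spike-and-Laplace coupling are illustrative assumptions and do not reproduce the exact model or the sampling-based learning algorithm from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions, not taken from the lecture.
n, p = 200, 1000                 # samples, linearly ordered covariates (e.g., markers)
stay_on, turn_on = 0.95, 0.01    # assumed Markov transition probabilities
laplace_scale = 1.0              # assumed scale of the Laplacian prior

# First-order Markov chain of activation indicators along the feature order:
# an active position tends to stay active, so relevant features form contiguous blocks.
z = np.zeros(p, dtype=bool)
z[0] = rng.random() < turn_on
for j in range(1, p):
    p_on = stay_on if z[j - 1] else turn_on
    z[j] = rng.random() < p_on

# Sparsity-enforcing Laplacian prior on the coefficients of activated features;
# features whose indicator is off contribute nothing to the response.
beta = np.where(z, rng.laplace(0.0, laplace_scale, size=p), 0.0)

# Standard linear regression between features and response with Gaussian noise.
X = rng.standard_normal((n, p))
y = X @ beta + rng.standard_normal(n)

print("indices of activated (block-structured) features:", np.flatnonzero(z))
```

The sketch only shows how block-structured sparsity arises from the Markov coupling of the indicators; the lecture's actual method learns the coefficients and activation states from data with a sampling-based algorithm, which is not reproduced here.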
Keywords: feature space; block-regularized regression model; whole genome association
Source: VideoLectures.NET
Last reviewed: 2020-06-22 (chenxin)
Views: 74