Stability Selection for High-Dimensional Data |
| Course URL: | http://videolectures.net/sip08_buhlmann_ssfhd/ |
| Lecturer: | Peter Bühlmann |
| Institution: | ETH Zurich |
| Date: | 2008-12-18 |
| Language: | English |
| Course description: | Despite remarkable progress over the past 5 years, estimation of high-dimensional structure, such as in graphical modeling, cluster analysis or variable selection in (generalized) regression, remains difficult. Among the main problems are: (i) the choice of an appropriate amount of regularization; (ii) a potential lack of stability of a solution and quantification of evidence or significance of a selected structure or of a set of selected variables. We introduce the new method of stability selection, which addresses these two major problems for high-dimensional structure estimation, both from a practical and theoretical point of view. Stability selection is based on subsampling in combination with (high-dimensional) selection algorithms. As such, the method is extremely general and has a very wide range of applicability. Stability selection provides finite sample control for some error rates of false discoveries and hence a transparent principle to choose a proper amount of regularization for structure estimation or model selection. Maybe even more importantly, results are typically remarkably insensitive to the chosen amount of regularization. Another property of stability selection is the empirical and theoretical improvement over pre-specified selection methods. We prove for randomized Lasso that stability selection will be model selection consistent even if the necessary conditions needed for consistency of the original Lasso method are violated. We demonstrate stability selection for variable selection, Gaussian graphical modeling and clustering, using real and simulated data. This is joint work with Nicolai Meinshausen. |
| Keywords: | Computer Science; Machine Learning; Statistical Learning; Social Sciences; Economics; Econometrics |
| Source: | VideoLectures.NET |
| Last reviewed: | 2020-06-18:dingaq |
| Views: | 167 |
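
The description says that stability selection combines subsampling with a selection algorithm and keeps the variables that are selected consistently. The following is a minimal sketch of that idea, not the lecture's implementation. It assumes a simple correlation screen as the base selector (a stand-in for the Lasso mentioned in the abstract), subsamples of half the data, and a selection-frequency threshold of 0.6. The function names, parameters, and simulated data are all hypothetical, and the sketch does not reproduce the finite-sample error control discussed in the talk.

```python
import random


def top_k_by_correlation(X, y, k):
    """Hypothetical base selector (stand-in for Lasso): return the indices of
    the k columns of X whose absolute correlation with y is largest."""
    n, p = len(y), len(X[0])
    y_mean = sum(y) / n
    scores = []
    for j in range(p):
        col = [row[j] for row in X]
        c_mean = sum(col) / n
        cov = sum((c - c_mean) * (t - y_mean) for c, t in zip(col, y))
        var = sum((c - c_mean) ** 2 for c in col)
        # Proportional to |correlation|; the scale of y is the same for all j.
        scores.append(abs(cov) / var ** 0.5 if var > 0 else 0.0)
    ranked = sorted(range(p), key=lambda j: scores[j], reverse=True)
    return set(ranked[:k])


def stability_selection(X, y, k, n_subsamples=100, threshold=0.6, seed=0):
    """Run the base selector on random half-samples and keep the variables
    whose selection frequency reaches the (assumed) threshold."""
    rng = random.Random(seed)
    n, p = len(y), len(X[0])
    counts = [0] * p
    for _ in range(n_subsamples):
        idx = rng.sample(range(n), n // 2)  # subsample without replacement
        X_sub = [X[i] for i in idx]
        y_sub = [y[i] for i in idx]
        for j in top_k_by_correlation(X_sub, y_sub, k):
            counts[j] += 1
    freqs = [c / n_subsamples for c in counts]
    stable = [j for j in range(p) if freqs[j] >= threshold]
    return stable, freqs


# Simulated data (assumption): 20 Gaussian features, only the first two matter.
data_rng = random.Random(42)
n_obs, n_vars = 100, 20
X = [[data_rng.gauss(0, 1) for _ in range(n_vars)] for _ in range(n_obs)]
y = [3 * row[0] - 2 * row[1] + data_rng.gauss(0, 1) for row in X]

stable, freqs = stability_selection(X, y, k=5)
print("stable variables:", stable)
```

In this sketch the selection frequencies in `freqs` play the role of the per-variable evidence the abstract refers to; swapping in a different base selector only requires replacing `top_k_by_correlation`.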
