High-dimensional regression with noisy and missing data: Provable guarantees with non-convexity
Course URL: http://videolectures.net/nips2011_loh_nonconvexity/
Lecturer: Po-Ling Loh
Institution: University of California
Date: 2012-01-25
Language: English
Course description: Although the standard formulations of prediction problems involve fully-observed and noiseless data drawn in an i.i.d. manner, many applications involve noisy and/or missing data, possibly involving dependencies. We study these issues in the context of high-dimensional sparse linear regression, and propose novel estimators for the cases of noisy, missing, and/or dependent data. Many standard approaches to noisy or missing data, such as those using the EM algorithm, lead to optimization problems that are inherently non-convex, and it is difficult to establish theoretical guarantees on practical algorithms. While our approach also involves optimizing non-convex programs, we are able to both analyze the statistical error associated with any global optimum, and prove that a simple projected gradient descent algorithm will converge in polynomial time to a small neighborhood of the set of global minimizers. On the statistical side, we provide non-asymptotic bounds that hold with high probability for the cases of noisy, missing, and/or dependent data. On the computational side, we prove that under the same types of conditions required for statistical consistency, the projected gradient descent algorithm will converge at geometric rates to a near-global minimizer. We illustrate these theoretical predictions with simulations, showing agreement with the predicted scalings.
Keywords: high-dimensional regression; data prediction
Source: VideoLectures.NET
Data collected: 2020-11-30:zyk
Last reviewed: 2020-11-30:zyk
Views: 36
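
Illustrative sketch: The description mentions a simple projected gradient descent algorithm applied to a non-convex program for sparse regression with noisy covariates, but this page does not give the objective. The sketch below is one possible instance under stated assumptions, not the lecture's exact estimator: covariates are observed with additive noise, Z = X + W, the noise covariance sigma_w^2 * I is assumed known, the loss is the quadratic 0.5 * b' G b - g' b with G = Z'Z/n - sigma_w^2 * I (which can be indefinite when p > n, hence non-convex), and iterates are projected onto an l1 ball. All function names, parameter values, and the oracle choice of radius are hypothetical.

```python
import numpy as np


def project_l1_ball(v, radius):
    """Euclidean projection of v onto {x : ||x||_1 <= radius}, radius > 0."""
    if np.abs(v).sum() <= radius:
        return v.copy()
    u = np.sort(np.abs(v))[::-1]          # magnitudes in decreasing order
    css = np.cumsum(u)
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u * k > css - radius)[0][-1]
    theta = (css[rho] - radius) / (rho + 1.0)   # soft-threshold level
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)


def projected_gradient_descent(gamma_mat, gamma_vec, radius, step, n_iters=500):
    """Minimize 0.5 * b' G b - g' b over the l1 ball of the given radius."""
    beta = np.zeros_like(gamma_vec)
    for _ in range(n_iters):
        grad = gamma_mat @ beta - gamma_vec
        beta = project_l1_ball(beta - step * grad, radius)
    return beta


def run_demo(seed=0, n=200, p=300, k=5, sigma_w=0.2, sigma_eps=0.1):
    """Simulated example with p > n and additive covariate noise (assumed setup)."""
    rng = np.random.default_rng(seed)
    beta_star = np.zeros(p)
    beta_star[:k] = 1.0                                  # k-sparse true vector
    x = rng.standard_normal((n, p))                      # clean covariates (unobserved)
    y = x @ beta_star + sigma_eps * rng.standard_normal(n)
    z = x + sigma_w * rng.standard_normal((n, p))        # noisy covariates (observed)

    # Noise-corrected surrogates for the covariance and cross-covariance.
    gamma_mat = z.T @ z / n - sigma_w**2 * np.eye(p)
    gamma_vec = z.T @ y / n

    radius = np.abs(beta_star).sum()                     # oracle radius, assumption
    step = 1.0 / np.linalg.eigvalsh(gamma_mat).max()     # step from top eigenvalue
    beta_hat = projected_gradient_descent(gamma_mat, gamma_vec, radius, step)
    return beta_hat, beta_star, radius


if __name__ == "__main__":
    beta_hat, beta_star, radius = run_demo()
    print("l2 error:", np.linalg.norm(beta_hat - beta_star))
    print("l1 norm of estimate:", np.abs(beta_hat).sum(), "radius:", radius)
```

The radius here is set from the true coefficient vector purely for the simulation; in practice it would have to be chosen by some other means, such as cross-validation.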