Faster and Sample Near-Optimal Algorithms for Proper Learning Mixtures of Gaussians
Course URL: http://videolectures.net/colt2014_kamath_gaussians/
Lecturer: Gautam Kamath
Institution: Massachusetts Institute of Technology (MIT)
Date: 2014-07-15
Language: English

Course description: We provide an algorithm for properly learning mixtures of two single-dimensional Gaussians without any separability assumptions. Given O(1/ε²) samples from an unknown mixture, our algorithm outputs a mixture that is ε-close in total variation distance, in time O(1/ε⁵). Our sample complexity is optimal up to logarithmic factors, and significantly improves upon both Kalai et al., whose algorithm has a prohibitive dependence on 1/ε, and Feldman et al., whose algorithm requires bounds on the mixture parameters and depends pseudo-polynomially on these parameters.

One of our main contributions is an improved and generalized algorithm for selecting a good candidate distribution from among competing hypotheses. Namely, given a collection of N hypotheses containing at least one candidate that is ε-close to an unknown distribution, our algorithm outputs a candidate which is O(ε)-close to the distribution. The algorithm requires O(log N / ε²) samples from the unknown distribution and O(N log N / ε²) time, improving the dependence of the running time on N from quadratic to quasilinear, compared with previous such results (such as the Scheffé estimator). Given the wide use of such results for the purpose of hypothesis selection, our improved algorithm implies immediate improvements to any such use.
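The quasilinear-time selection procedure itself is not given in this abstract. For orientation, here is a minimal Python sketch of the classical Scheffé tournament, the quadratic-time baseline that the lecture improves upon; the grid-based integration, the function names, and the toy candidate mixtures are illustrative assumptions, not the paper's construction.

```python
import numpy as np
from scipy.stats import norm

def scheffe_test(p_pdf, q_pdf, samples, grid):
    """Pairwise Scheffé test: decide which of two densities p, q better
    fits the samples. The Scheffé set is A = {x : p(x) > q(x)}; we compare
    the empirical mass of A against p(A) and q(A), approximated here by a
    Riemann sum on `grid`, and return 0 (p wins) or 1 (q wins)."""
    p_vals, q_vals = p_pdf(grid), q_pdf(grid)
    in_A = p_vals > q_vals
    dx = grid[1] - grid[0]
    p_A = np.sum(p_vals[in_A]) * dx   # approx. p(A)
    q_A = np.sum(q_vals[in_A]) * dx   # approx. q(A)
    # Empirical probability that a sample lands in A, checked by
    # evaluating both densities at each sample point.
    emp_A = np.mean(p_pdf(samples) > q_pdf(samples))
    return 0 if abs(p_A - emp_A) <= abs(q_A - emp_A) else 1

def select_hypothesis(pdfs, samples, grid):
    """Round-robin tournament over all N(N-1)/2 pairs; return the index
    of the hypothesis with the most pairwise wins."""
    wins = [0] * len(pdfs)
    for i in range(len(pdfs)):
        for j in range(i + 1, len(pdfs)):
            winner = scheffe_test(pdfs[i], pdfs[j], samples, grid)
            wins[i if winner == 0 else j] += 1
    return max(range(len(pdfs)), key=wins.__getitem__)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Unknown distribution: a mixture of two one-dimensional Gaussians.
    n = 2000
    true_samples = np.where(rng.random(n) < 0.5,
                            rng.normal(-2.0, 1.0, n),
                            rng.normal(3.0, 0.5, n))
    # Toy hypothesis class: mixtures differing in the second component's
    # mean; the m = 3.0 candidate coincides with the true distribution.
    means = (0.0, 1.5, 3.0, 5.0)
    pdfs = [lambda x, m=m: 0.5 * norm.pdf(x, -2.0, 1.0)
                           + 0.5 * norm.pdf(x, m, 0.5)
            for m in means]
    grid = np.linspace(-10.0, 10.0, 4001)
    best = select_hypothesis(pdfs, true_samples, grid)
    print("selected candidate mean:", means[best])
```

With N candidates this tournament performs N(N-1)/2 pairwise tests, which is the quadratic-in-N running time that, per the abstract, the lecture's algorithm reduces to quasilinear while using O(log N / ε²) samples.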
Keywords: optimal algorithms; Gaussian learning
Source: VideoLectures.NET
Data collected: 2020-11-23:zyk
Last reviewed: 2020-11-23:zyk
Views: 38