拟合优度、均匀性和独立性的检验分布Testing Distributions for Goodness of fit, Homogeneity, and Independence |
|
课程网址: | http://videolectures.net/ripd07_batu_tdg/ |
主讲教师: | Tugkan Batu |
开课单位: | 伦敦经济学院 |
开课时间: | 2008-02-25 |
课程语种: | 英语 |
中文简介: | 在本次演讲中,我将描述几种基本统计推断任务的算法。本研究的主要焦点是每个任务的样本复杂度,作为基础离散概率分布的域大小的函数。算法只能访问i.i.d.来自分布的样本并基于这些样本进行推断。这里研究的推理任务是:(i)与固定分布的相似性(即拟合优度); (ii)两个分布之间的相似性(即同质性); (iii)联合分配的独立性; (iv)熵估计。对于这些任务中的每一个,都提出了具有次线性样本复杂度的算法(例如,对大小为$ n $的离散域的拟合优度测试显示需要$ O(\ sqrt {n} polylog(n))$ samples )。伴随的下限参数表明除了这些算法之外的所有算法都能达到接近最佳的性能。给定关于分布的一些额外信息(例如,关于域上的固定总订单的分布是单调的或单峰的),这些任务的样本复杂度在域大小中变为多对数。 |
课程简介: | In this talk, I will describe algorithms for several fundamental statistical inference tasks. The main focus of this research is the sample complexity of each task as a function of the domain size for the underlying discrete probability distributions. The algorithms are given access only to i.i.d. samples from the distributions and make inferences based on these samples. The inference tasks studied here are: (i) similarity to a fixed distribution (i.e., goodness-of-fit); (ii) similarity between two distributions (i.e., homogeneity); (iii) independence of joint distributions; and (iv) entropy estimation. For each of these tasks, an algorithm with sublinear sample complexity is presented (e.g., a goodness-of-fit test on a discrete domain of size $n$ is shown to require $O(\sqrt{n}polylog(n))$ samples). Accompanying lower bound arguments show that all but one of these algorithms achieve a near-optimal performance. Given some extra information on the distributions (such as the distribution is monotone or unimodal with respect to a fixed total order on the domain), the sample complexity of these tasks become polylogarithmic in the domain size. |
关 键 词: | 推断任务; 样本复杂度; 离散概率 |
课程来源: | 视频讲座网 |
最后编审: | 2019-09-14:lxf |
阅读次数: | 80 |