On the Computational and Statistical Interface and "BIG DATA"
Course URL: http://videolectures.net/colt2014_jordan_bigdata/  
Lecturer: Michael I. Jordan
Institution: University of California
Date: 2014-07-15
Language: English
Course description: The rapid growth in the size and scope of datasets in science and technology has created a need for novel foundational perspectives on data analysis that blend the statistical and computational sciences. That classical perspectives from these fields are not adequate to address emerging problems in "Big Data" is apparent from their sharply divergent nature at an elementary level: in computer science, the growth of the number of data points is a source of "complexity" that must be tamed via algorithms or hardware, whereas in statistics, the growth of the number of data points is a source of "simplicity" in that inferences are generally stronger and asymptotic results or concentration theorems can be invoked. We present several research vignettes on topics at the computation/statistics interface, an interface that we aim to characterize in terms of theoretical tradeoffs between statistical risk, amount of data and "externalities" such as computation, communication and privacy.
Keywords: datasets; algorithmic statistics
Source: VideoLectures.NET
Data collected: 2021-03-10 (zyk)
Last reviewed: 2021-03-11 (zyk)
Views: 46