On the Computational and Statistical Interface and "Big Data"

27th Annual Conference on Learning Theory (COLT), Barcelona 2014
Course URL: http://videolectures.net/colt2014_jordan_bigdata/
Lecturer: Michael I. Jordan
Institution: University of California, Berkeley
Date: 2014-07-15
Language: English
Course description: The rapid growth in the size and scope of datasets in science and technology has created a need for novel foundational perspectives on data analysis that blend the statistical and computational sciences. That classical perspectives from these fields are not adequate to address emerging problems in "Big Data" is apparent from their sharply divergent nature at an elementary level: in computer science, the growth of the number of data points is a source of "complexity" that must be tamed via algorithms or hardware, whereas in statistics, the growth of the number of data points is a source of "simplicity" in that inferences are generally stronger and asymptotic results or concentration theorems can be invoked. We present several research vignettes on topics at the computation/statistics interface, an interface that we aim to characterize in terms of theoretical tradeoffs between statistical risk, amount of data, and "externalities" such as computation, communication, and privacy.
Keywords: big data; computational learning theory; statistical learning
Source: VideoLectures.NET
Data collected: 2020-06-06: 吴淑曼
Last reviewed: 2020-06-11: chenxin