Sketching and Streaming for Distributions
Course URL: http://videolectures.net/ripd07_mcgregor_ssd/
Lecturer: Andrew McGregor
Institution: University of California, San Diego
Date: 2008-02-25
Language: English
Course description: In this talk we look at the problem of sketching distributions in the data-stream model. This model has become increasingly popular over the last ten years as practitioners in a variety of areas have sought to design systems that handle massive amounts of data in a time- and space-efficient manner. Problems such as estimating the distance between two streams, testing independence or identifying correlations, and determining whether a distribution is compressible play an important role. We start by reviewing results on using $p$-stable distributions to compute small-space sketches that can be used to estimate the $L_p$ distance between two distributions. We then present recent results on extending this work to estimate the strength of correlations between two distributions. We finish with an overview of work that seeks to characterize the limits of these techniques, with particular emphasis on what is possible with regard to sketching information divergences such as the Kullback-Leibler, Jensen-Shannon, and Hellinger divergences.
Keywords: data-stream model; sketching distributions; data
Source: VideoLectures.NET
Last reviewed: 2019-09-17:lxf
Views: 30
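
Illustrative note: the course description mentions using $p$-stable distributions to build small-space sketches for $L_p$ distance. The following is a minimal sketch of that idea for $p = 1$, using Cauchy (1-stable) projections and a median estimator. It is not taken from the talk. The names `CauchySketch` and `estimate_l1_distance`, the parameters `dim`, `k`, and `seed`, and the explicitly stored projection matrix are all assumptions made for illustration; a genuinely small-space algorithm would regenerate the projection entries pseudo-randomly rather than store them.

```python
import math
import random
import statistics


class CauchySketch:
    """Linear sketch of a frequency vector using 1-stable (Cauchy) projections.

    Hypothetical illustration only, not the construction presented in the talk.
    """

    def __init__(self, dim, k, seed):
        rng = random.Random(seed)
        # k x dim matrix of standard Cauchy samples, drawn via the inverse CDF.
        # Stored explicitly for clarity (an assumption of this example).
        self.proj = [
            [math.tan(math.pi * (rng.random() - 0.5)) for _ in range(dim)]
            for _ in range(k)
        ]
        self.values = [0.0] * k

    def update(self, index, delta):
        """Process one stream item: coordinate `index` changes by `delta`."""
        for j, row in enumerate(self.proj):
            self.values[j] += row[index] * delta


def estimate_l1_distance(a, b):
    """Estimate ||x - y||_1 from two sketches built with the same seed.

    By 1-stability, each coordinate of the sketch difference is distributed as
    ||x - y||_1 times a standard Cauchy variable, and the median of the
    absolute value of a standard Cauchy variable is 1.
    """
    return statistics.median(abs(u - v) for u, v in zip(a.values, b.values))
```

Both streams must be sketched with the same `seed` so that they share one projection matrix; the estimate then depends only on the difference of the two sketches, which is why the sketches can be maintained independently and compared afterwards.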