0


基于草图的最优分布子模优化

Optimal Distributed Submodular Optimization via Sketching
课程网址: http://videolectures.net/kdd2018_bateni_distributed_submodular/  
主讲教师: Mohammadhossein Bateni
开课单位: Google, Inc.
开课时间: 2018-11-23
课程语种: 英语
中文简介:
我们提出了若干类子模优化问题的分布式算法,如k覆盖、集合覆盖、设施位置和概率覆盖。新算法具有几乎最优的空间复杂度、最优的近似保证、最优的通信复杂度(并且仅在四轮计算中运行),解决了现有工作的主要缺点。我们首先提出了一种每台机器仅使用O ̄(n)空间的k覆盖分布式算法,然后将其扩展到几个子模优化问题,改进了所有上述问题的先前结果——例如,我们的设施位置问题算法改进了最佳公知算法的空间(Lindgren等人[26])。我们的算法可以在各种分布式框架中实现,如MapReduce和RAM模型。在硬度方面,我们通过信息论论证了均匀采样的局限性。此外,我们在各种数据集上对我们的算法(在MapReduce中实现)进行了广泛的实证研究。我们观察到,使用比输入小30-600倍的草图,可以解决覆盖最大化问题,质量非常接近最先进的单机算法。最后,我们展示了我们的算法在大规模特征选择中的应用。
课程简介: We present distributed algorithms for several classes of submodular optimization problems such as k-cover, set cover, facility location, and probabilistic coverage. The new algorithms enjoy almost optimal space complexity, optimal approximation guarantees, optimal communication complexity (and run in only four rounds of computation), addressing major shortcomings of prior work. We first present a distributed algorithm for k-cover using only O˜(n) space per machine, and then extend it to several submodular optimization problems, improving previous results for all the above problems—e.g., our algorithm for facility location problem improves the space of the best known algorithm (Lindgren et al. [26]). Our algorithms are implementable in various distributed frameworks such as MapReduce and RAM models. On the hardness side we demonstrate the limitations of uniform sampling via an information theoretic argument. Furthermore, we perform an extensive empirical study of our algorithms (implemented in MapReduce) on a variety of datasets. We observe that using sketches 30–600 times smaller than the input, one can solve the coverage maximization problem with quality very close to that of the state-of-the-art single-machine algorithm. Finally, we demonstrate an application of our algorithm in largescale feature selection.
关 键 词: 基于草图; 最优分布子模优化; k覆盖分布式算法
课程来源: 视频讲座网
数据采集: 2023-03-09:cyh
最后编审: 2023-05-15:cyh
阅读次数: 33