New Developments in the Theory of Clustering
Course URL: http://videolectures.net/kdd2010_vassilvitskii_venkatasubramanian...
Lecturers: Suresh Venkatasubramanian; Sergei Vassilvitskii
Institution: Yahoo! Inc.
Date: 2010-10-01
Language: English
Course description: Theoretical and applied research in clustering has often followed separate paths, with only the occasional confluence of interest. In this tutorial, we provide an overview of recent results in the theory of clustering that bridge this divide and are of interest to practitioners. We describe a new approach to selecting the initial cluster centers in the k-means algorithm, which leads both to provable approximation guarantees and to practical improvements in the quality of the clustering. We continue by explaining why the algorithm works in non-Euclidean spaces, for example, for clustering under information measures like the Kullback-Leibler divergence, and present new algorithms for these measures. Finally, we discuss recent results on the stability of clusterings and their implications for our ability to judge the quality of a clustering.
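The seeding step mentioned in the description can be illustrated with a short sketch. It assumes the approach in question is distance-squared-weighted sampling (commonly known as k-means++ seeding); the description does not name the method, so that identification, along with the function names `sq_dist` and `dsq_seeding`, is an assumption made here for illustration. The sketch uses squared Euclidean distance and only the Python standard library.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def dsq_seeding(points, k, rng=None):
    """Pick k initial centers by distance-squared-weighted sampling.

    Hypothetical helper, not taken from the tutorial: the first center is
    chosen uniformly at random; each later center is drawn with probability
    proportional to its squared distance from the nearest center so far.
    """
    rng = rng or random.Random()
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and len(points)")

    centers = [rng.choice(points)]
    # d2[i] is the squared distance from points[i] to its nearest center.
    d2 = [sq_dist(p, centers[0]) for p in points]

    while len(centers) < k:
        if sum(d2) == 0:
            # Every point coincides with a chosen center: sample uniformly.
            idx = rng.randrange(len(points))
        else:
            idx = rng.choices(range(len(points)), weights=d2, k=1)[0]
        centers.append(points[idx])
        # Update nearest-center distances with the newly added center.
        d2 = [min(old, sq_dist(p, points[idx])) for old, p in zip(d2, points)]

    return centers


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (9.0, 0.1)]
    print(dsq_seeding(data, k=3, rng=random.Random(0)))
```

The returned centers would then be handed to an ordinary k-means (Lloyd) iteration. Points far from every existing center are more likely to be picked, which tends to spread the initial centers across distinct clusters.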
Keywords: clustering theory; confluence of interest; k-means algorithm; cluster analysis
Source: VideoLectures.NET
Last reviewed: 2020-01-13:chenxin