
Persistence-based Clustering
Course URL: http://videolectures.net/solomon_skraba_pbc/  
Lecturer: Primož Škraba
Institution: Jožef Stefan Institute
Date: 2010-03-26
Language: English
Course description: Clustering is a classical problem that seeks important segments in an unstructured data set. In general, this is an ill-posed problem. A common approach is to consider the data set as a sample of an unknown probability distribution function on some underlying space. Clustering then becomes a problem of understanding the behaviour of the distribution function. In this talk, I will introduce persistence-based clustering. Under some mild assumptions, the algorithm comes with a variety of strong theoretical guarantees. In particular, it provably approximates the structure of the underlying distribution function even when the underlying space is only approximately known. The approach is based heavily on persistent homology (also referred to as topological persistence), a relatively recent development in the area of computational topology. It is precisely this framework that makes many of the proofs possible. The talk will include a general introduction to persistence, so no prior knowledge is expected. On the practical side, the algorithm is efficient in both memory and running time, so it can handle large, high-dimensional data sets quickly. Finally, it provides visual feedback in addition to the clusters, which is particularly useful when the data sets cannot be visualized.
Keywords: computer science; machine learning; clustering
Source: VideoLectures.NET
Last reviewed: 2020-07-23:yumf
Views: 81