Fast Clustering based on Kernel Density Estimation
Course URL: http://videolectures.net/ida07_hinneburg_dfc/
Lecturer: Alexander Hinneburg
Institution: Martin Luther University
Date: 2007-10-08
Language: English
Abstract: The Denclue algorithm employs a cluster model based on kernel density estimation. A cluster is defined by a local maximum of the estimated density function. Data points are assigned to clusters by hill climbing, i.e. points that climb to the same local maximum are put into the same cluster. A disadvantage of Denclue 1.0 is that its hill climbing may take unnecessarily small steps at the beginning and never converges exactly to the maximum; it only comes close. We introduce a new hill climbing procedure for Gaussian kernels that adjusts the step size automatically at no extra cost. We prove that the procedure converges exactly to a local maximum by reducing it to a special case of the expectation maximization algorithm. We show experimentally that the new procedure needs far fewer iterations and can be accelerated by sampling-based methods while sacrificing only a small amount of accuracy.
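The hill climbing described in the abstract can be illustrated with a minimal Python sketch. It assumes the step-size-free update for Gaussian kernels takes the familiar weighted-mean (mean-shift-style) form, in which each iterate is replaced by the kernel-weighted average of the data; the abstract itself does not state the formula, so treat this as an assumption. The function names, the bandwidth `h`, the tolerance, and `merge_radius` are hypothetical choices for illustration, and the sampling-based acceleration is not shown.

```python
import math


def gaussian_weights(x, data, h):
    """Unnormalised Gaussian kernel weight of every data point as seen from x."""
    return [
        math.exp(-sum((a - b) ** 2 for a, b in zip(x, p)) / (2.0 * h * h))
        for p in data
    ]


def hill_climb(x, data, h, tol=1e-8, max_iter=1000):
    """Climb the kernel density estimate from x using a weighted-mean update.

    Assumption: the new position is the kernel-weighted average of the data,
    so no explicit step size has to be chosen.
    """
    x = list(x)
    for _ in range(max_iter):
        w = gaussian_weights(x, data, h)
        total = sum(w)
        new_x = [
            sum(wi * p[d] for wi, p in zip(w, data)) / total
            for d in range(len(x))
        ]
        shift = math.dist(new_x, x)
        x = new_x
        if shift < tol:  # stop once the iterate has effectively stopped moving
            break
    return x


def cluster(data, h, merge_radius=1e-3):
    """Give two points the same label when their climbs end at the same maximum."""
    modes, labels = [], []
    for p in data:
        m = hill_climb(p, data, h)
        for k, known in enumerate(modes):
            if math.dist(m, known) < merge_radius:
                labels.append(k)
                break
        else:  # no known maximum is close enough: record a new one
            modes.append(m)
            labels.append(len(modes) - 1)
    return modes, labels


# Toy data (made up for illustration): two well-separated groups in 2-D.
data = [(0.0, 0.0), (0.2, 0.1), (-0.1, 0.2), (5.0, 5.0), (5.1, 4.8), (4.9, 5.2)]
modes, labels = cluster(data, h=0.5)
print(labels)
```

Because each update is a weighted average of the data, there is no step size to tune. Points whose climbs end within `merge_radius` of one another are treated as having reached the same local maximum and therefore share a cluster label.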
Keywords: kernel density estimation; clustering; density function
Source: VideoLectures.NET
Last reviewed: 2019-04-27 (cwx)
Views: 108