Eigenvector Sensitive Feature Selection for Spectral Clustering
Course URL: http://videolectures.net/ecmlpkdd2011_ren_clustering/
Lecturer: Jiangtao Ren
Institution: Sun Yat-sen University
Date: 2011-11-30
Course language: Simplified Chinese
Abstract: Spectral clustering is one of the most popular methods for data clustering, and its performance is determined by the quality of the eigenvectors of the associated graph Laplacian. The graph Laplacian is generally constructed from all features, which degrades the quality of those eigenvectors when the dataset contains many noisy or irrelevant features. To address this problem, we propose a novel unsupervised feature selection method inspired by perturbation analysis theory, which relates the perturbation of a matrix's eigenvectors to the perturbation of its elements. We evaluate the importance of each feature by the average L1 norm of the perturbation, with respect to that feature's perturbation, of the first k eigenvectors of the graph Laplacian, that is, those corresponding to the k smallest positive eigenvalues. Extensive experiments on several high-dimensional multi-class datasets demonstrate that our method performs well compared with some state-of-the-art unsupervised feature selection methods.
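The scoring idea in the abstract can be illustrated with a rough numerical sketch. This is not the lecturer's algorithm: the abstract points to an analytic perturbation result, whereas the code below substitutes a plain finite-difference approximation (rescale one feature slightly, rebuild the Laplacian, and measure how far the eigenvectors move). The RBF affinity, the symmetric normalized Laplacian, and the names `rbf_affinity`, `bottom_eigvecs`, `feature_scores`, `sigma`, `eps`, and `tol` are all assumptions made for illustration.

```python
import numpy as np

def rbf_affinity(X, sigma=1.0):
    """Dense RBF affinity matrix (assumed graph construction)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops
    return W

def bottom_eigvecs(W, k, tol=1e-8):
    """Eigenvectors of the normalized Laplacian for the k smallest
    eigenvalues above tol (treated here as 'positive')."""
    d = W.sum(axis=1)  # assumes every node has nonzero degree
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    vals, vecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    keep = np.where(vals > tol)[0][:k]
    return vecs[:, keep]

def feature_scores(X, k=2, eps=1e-2, sigma=1.0):
    """Average L1 norm of the eigenvector change when each feature
    is rescaled by (1 + eps). Finite-difference stand-in only."""
    base = bottom_eigvecs(rbf_affinity(X, sigma), k)
    scores = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        Xp = X.copy()
        Xp[:, j] *= 1.0 + eps  # perturb feature j only
        pert = bottom_eigvecs(rbf_affinity(Xp, sigma), k)
        # Eigenvectors are defined up to sign; align before comparing.
        signs = np.sign(np.sum(base * pert, axis=0))
        signs[signs == 0] = 1.0
        diff = pert * signs - base
        scores[j] = np.mean(np.sum(np.abs(diff), axis=0))
    return scores

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(30, 4))
    print(feature_scores(X, k=2))
```

The sketch returns one score per feature and leaves the ranking rule open, since the abstract does not say how the scores are turned into a selection. Nearly repeated eigenvalues make individual eigenvectors unstable, so scores from this approximation should be read with care in that case.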
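Keywords: kernel methods; kernel principal component analysis; computer science; machine learning; feature selection
Source: VideoLectures.NET
Last reviewed: 2019-11-30:lxf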