A Generative Dyadic Aspect Model for Evidence Accumulation Clustering
Course URL: http://videolectures.net/simbad2011_figueiredo_clustering/
Lecturer: Mário A. T. Figueiredo
Institution: Instituto de Telecomunicações, Lisbon
Date: 2011-10-17
Language: English
Course description: Evidence accumulation clustering (EAC) is a clustering combination method in which a pairwise similarity matrix (the so-called co-association matrix) is learnt from a clustering ensemble. The co-association matrix counts how often each pair of objects co-occurs in the same cluster, which avoids the cluster correspondence problem faced by many other clustering combination approaches. Starting from the observation that co-occurrences are a special type of dyadic data, we propose to model co-associations with a generative aspect model for dyadic data. Under this model, extracting a consensus clustering amounts to solving a maximum likelihood estimation problem, which we address with the expectation-maximization (EM) algorithm. We refer to the resulting method as the probabilistic ensemble clustering algorithm (PEnCA). Because the problem is cast in a probabilistic framework, model selection criteria can be used to choose the number of clusters automatically. To compare our method with other combination techniques that are also based on probabilistic modeling of the clustering ensemble problem, we ran experiments on synthetic and real benchmark datasets; the results show that the proposed approach is competitive.
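The two steps named in the description, counting co-associations and fitting an aspect model by EM, can be illustrated with a minimal Python sketch. It is not the lecturer's implementation: the function names `co_association` and `fit_aspect_model`, the symmetric parameterization P(y_i, y_j) = Σ_z P(z) P(y_i|z) P(y_j|z), the inclusion of self-pairs on the diagonal, and the random initialization are all assumptions made here for illustration, and the dense (n, n, K) arrays only suit small problems.

```python
import numpy as np


def co_association(ensemble):
    """Count how many partitions place each pair of objects in the same cluster.

    ensemble: array-like of shape (n_partitions, n_objects) with integer labels.
    Labels are only compared within one partition, so no label matching
    across partitions is required.
    """
    labels = np.asarray(ensemble)
    n_objects = labels.shape[1]
    counts = np.zeros((n_objects, n_objects), dtype=int)
    for partition in labels:
        counts += (partition[:, None] == partition[None, :]).astype(int)
    return counts


def fit_aspect_model(counts, n_aspects, n_iter=100, seed=0):
    """EM for an assumed symmetric aspect model of the co-association counts."""
    counts = np.asarray(counts, dtype=float)
    n_objects = counts.shape[0]
    rng = np.random.default_rng(seed)
    prior = np.full(n_aspects, 1.0 / n_aspects)        # P(z)
    cond = rng.random((n_objects, n_aspects)) + 0.1    # P(y | z), one column per z
    cond /= cond.sum(axis=0, keepdims=True)
    observed = counts > 0                              # pairs that enter the likelihood
    tiny = np.finfo(float).tiny                        # guards against 0 / 0
    log_liks = []
    for _ in range(n_iter):
        # E-step: joint[i, j, z] = P(z) P(y_i | z) P(y_j | z)
        joint = prior[None, None, :] * cond[:, None, :] * cond[None, :, :]
        mix = joint.sum(axis=2)                        # P(y_i, y_j)
        log_liks.append(float((counts[observed] * np.log(mix[observed])).sum()))
        resp = joint / np.maximum(mix, tiny)[:, :, None]   # P(z | y_i, y_j)
        # M-step: re-estimate parameters from count-weighted responsibilities
        weighted = counts[:, :, None] * resp
        mass = weighted.sum(axis=(0, 1))
        prior = mass / mass.sum()
        cond = weighted.sum(axis=1) + weighted.sum(axis=0)
        cond /= cond.sum(axis=0, keepdims=True)
    # Consensus label of object i: the aspect z maximizing P(z) P(y_i | z)
    labels = (prior[None, :] * cond).argmax(axis=1)
    return prior, cond, labels, log_liks


# Toy ensemble (made up for illustration): three partitions of six objects
ensemble = [[0, 0, 0, 1, 1, 1],
            [1, 1, 1, 0, 0, 0],
            [0, 0, 1, 1, 1, 1]]
C = co_association(ensemble)
prior, cond, labels, log_liks = fit_aspect_model(C, n_aspects=2)
```

The recorded log-likelihood values should be non-decreasing across iterations, which is a convenient sanity check for any EM implementation. Choosing the number of aspects with a model selection criterion, as the description mentions, would require running the fit for several values of `n_aspects` and comparing penalized likelihoods; that step is not shown here.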
Keywords: Computer Science; Machine Learning; Clustering
Source: VideoLectures.NET
Last reviewed: 2020-06-01 (wuyq)
Views: 63