Correlation Clustering: From Theory to Practice
Course URL: http://videolectures.net/kdd2014_bonchi_garcia_soriano_liberty_cl...
Lecturers: Edo Liberty; David Garcia-Soriano; Francesco Bonchi
Institution: Yahoo
Date: 2014-10-15
Language: English
Course description: Correlation clustering is arguably the most natural formulation of clustering. Given a set of objects and a pairwise similarity measure between them, the goal is to cluster the objects so that, to the greatest extent possible, similar objects are placed in the same cluster and dissimilar objects in different clusters. Because it requires only a definition of similarity, it applies to a wide range of problems in different contexts, and it is particularly well suited to clustering structured objects for which feature vectors can be difficult to obtain. Despite its simplicity, generality and wide applicability, correlation clustering has so far received much more attention from the algorithmic theory community than from the data mining community. The goal of this tutorial is to show how correlation clustering can be a powerful addition to the toolkit of the data mining researcher and practitioner, and to encourage discussion and further research in the area. The tutorial surveys the problem and its most common variants, with an emphasis on the algorithmic techniques and key ideas developed to derive efficient solutions. It motivates the problems and discusses real-world applications, the scalability issues that may arise, and the existing approaches to handling them.
Keywords: clustering structured objects; algorithmic techniques; key ideas
Source: VideoLectures.NET
Data collected: 2020-11-05:zyk
Last reviewed: 2020-11-05:zyk
Views: 49
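
The problem statement in the course description can be made concrete with a small sketch. It assumes the simplest setting, in which every pair of objects is labeled either similar or dissimilar (the description speaks more generally of a pairwise similarity measure), and it counts "disagreements": similar pairs that are split apart plus dissimilar pairs that are grouped together. The function names `disagreements` and `pivot_clustering` are hypothetical, and the pivot heuristic shown is one well-known randomized approach from the literature (often called Pivot or KwikCluster); the record does not say which algorithms the tutorial itself presents.

```python
import random
from itertools import combinations


def disagreements(clusters, similar):
    """Count pairs the clustering gets wrong.

    clusters: list of disjoint sets of objects.
    similar:  set of frozenset pairs marked similar; every other pair
              is treated as dissimilar (an assumption of this sketch).
    """
    label = {obj: i for i, cluster in enumerate(clusters) for obj in cluster}
    cost = 0
    for u, v in combinations(list(label), 2):
        same_cluster = label[u] == label[v]
        is_similar = frozenset((u, v)) in similar
        if same_cluster != is_similar:
            cost += 1
    return cost


def pivot_clustering(objects, similar, seed=0):
    """Randomized pivot heuristic: visit objects in random order; each
    still-unassigned object becomes a pivot and takes all unassigned
    objects similar to it into its cluster."""
    order = list(objects)
    random.Random(seed).shuffle(order)
    unassigned = set(order)
    clusters = []
    for pivot in order:
        if pivot not in unassigned:
            continue
        cluster = {pivot} | {
            o for o in unassigned if frozenset((pivot, o)) in similar
        }
        unassigned -= cluster
        clusters.append(cluster)
    return clusters


if __name__ == "__main__":
    # Toy instance: a, b, c are mutually similar; d and e are similar.
    objects = ["a", "b", "c", "d", "e"]
    similar = {
        frozenset(p)
        for p in [("a", "b"), ("b", "c"), ("a", "c"), ("d", "e")]
    }
    found = pivot_clustering(objects, similar)
    print(found, disagreements(found, similar))
```

Note that only pairwise labels are needed as input; no feature vectors and no preset number of clusters appear anywhere in the sketch, which matches the description's point about structured objects.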