Data Streaming with Affinity Propagation
Course URL: http://videolectures.net/ecmlpkdd08_zhang_dswap/
Lecturers: Cyril Furtlehner; Michèle Sebag; Xiangliang Zhang
Institution: University of Paris-Sud (Paris XI)
Date: 2008-10-10
Language: English
Course description: This paper proposes StrAP (Streaming AP), which extends Affinity Propagation (AP) to data streaming. AP, a new clustering algorithm, uses message passing to extract the data items, or exemplars, that best represent the dataset. StrAP is built in several steps. The first, Weighted AP (WAP), extends AP to weighted items with no loss of generality. The second, Hierarchical WAP, reduces the quadratic complexity of AP by applying AP to subsets of the data and then applying Weighted AP to the exemplars extracted from all subsets. Finally, StrAP extends Hierarchical WAP to handle changes in the data distribution. Experiments on artificial datasets, on the Intrusion Detection benchmark (KDD99), and on a real-world problem, clustering the stream of jobs submitted to the EGEE grid system, provide a comparative validation of the approach.
Keywords: algorithms; data streams; samples; weighting
Source: VideoLectures.NET
Last reviewed: 2020-07-30, yumf
Views: 163
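
The Hierarchical WAP step in the course description (select exemplars on each data subset, then run a weighted selection on the exemplars pooled from all subsets) can be illustrated with a minimal sketch. Several things here are assumptions and not taken from the lecture: the function names are hypothetical, the integer data is made up, and exemplar selection is done by exhaustive search over the AP objective (one preference penalty per exemplar plus the similarity of every other item to its nearest exemplar) as a stand-in for AP's message passing, which only works on tiny inputs. The weighting rule, multiplying an item's similarity to its exemplar by the number of points it represents, is likewise an assumed simplification.

```python
from itertools import combinations


def similarity(x, y):
    """Negative squared distance, a common similarity choice for AP."""
    return -(x - y) ** 2


def select_exemplars(items, weights, preference):
    """Exhaustively maximise an AP-style objective on a tiny weighted set.

    Stand-in for message passing (assumption): score = preference per
    exemplar + sum of weight * similarity(item, nearest exemplar) over
    the non-exemplar items. Returns (exemplar indices, assignment list).
    """
    n = len(items)
    best_score, best = float("-inf"), None
    for size in range(1, n + 1):
        for chosen in combinations(range(n), size):
            score = preference * size
            for i in range(n):
                if i not in chosen:
                    score += weights[i] * max(
                        similarity(items[i], items[k]) for k in chosen
                    )
            if score > best_score:
                best_score, best = score, chosen
    # Each exemplar represents itself; other items go to the closest exemplar.
    assign = [
        i if i in best
        else max(best, key=lambda k: similarity(items[i], items[k]))
        for i in range(n)
    ]
    return list(best), assign


def hierarchical_wap(subsets, preference):
    """Two-level sketch: unweighted selection per subset, then a weighted
    selection over the pooled exemplars. Returns (exemplar, count) pairs."""
    values, counts = [], []
    for subset in subsets:
        exemplars, assign = select_exemplars(subset, [1] * len(subset), preference)
        for k in exemplars:
            values.append(subset[k])
            counts.append(assign.count(k))  # points represented by exemplar k
    exemplars, assign = select_exemplars(values, counts, preference)
    return [
        (values[k], sum(c for c, a in zip(counts, assign) if a == k))
        for k in exemplars
    ]


# Made-up data: two well-separated groups, split into two subsets.
subsets = [
    [0, 1, 2, 3, 4, 100, 101, 102],
    [2, 3, 4, 98, 99, 100, 101, 102],
]
result = hierarchical_wap(subsets, preference=-50)
print(result)  # [(2, 8), (100, 8)]
```

In this example each subset yields two exemplars, and the weighted second pass keeps the exemplar that stands for more points in each group. Each selection call only sees one subset or the pooled exemplars, never the full dataset, which is the idea behind the reduced cost of the hierarchical scheme.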