Toward Autonomic Grids: Analyzing the Job Flow with Affinity Streaming
Course URL: http://videolectures.net/kdd09_zhang_tagajfas/  
Lecturer: Xiangliang Zhang
Institution: University of Paris
Date: 2009-09-14
Language: English
Course description: The Affinity Propagation (AP) clustering algorithm proposed by Frey and Dueck (2007) provides an understandable, nearly optimal summary of a dataset, albeit with quadratic computational complexity. Motivated by autonomic computing, this paper extends AP to the data streaming framework. First, a hierarchical strategy reduces the complexity to ${\cal O}(N^{1+\varepsilon})$; the resulting distortion loss is analyzed in relation to the dimension of the data items. Second, AP is coupled with a change detection test to cope with non-stationary data distributions and to rebuild the model as needed. The resulting approach, StrAP, is applied to the stream of jobs submitted to the EGEE Grid, providing an understandable description of the job flow and enabling the system administrator to spot some sources of failure online.
Keywords: affinity propagation; clustering algorithm; dataset
Source: VideoLectures.NET
Last reviewed: 2019-05-10:cwx
Views: 24