Multi-Assignment Clustering for Boolean Data
Course URL: http://videolectures.net/icml09_frank_mac/
Lecturer: Mario Frank
Institution: ETH Zurich
Date: 2009-08-26
Language: English
Course description: Conventional clustering methods typically assume that each data item belongs to a single cluster. This assumption does not hold in general. To overcome this limitation, we propose a generative method for clustering vectorial data in which each object can be assigned to multiple clusters. Using a deterministic annealing scheme, our method decomposes the observed data into the contributions of individual clusters and infers their parameters. Experiments on synthetic Boolean data show that our method achieves higher accuracy in source parameter estimation and superior cluster stability compared to state-of-the-art approaches. We also apply our method to an important problem in computer security known as role mining. Experiments on real-world access control data show that our method generalizes better to new employees than other multi-assignment methods. In challenging situations with high noise levels, our approach maintains its good performance, while alternative state-of-the-art techniques lack robustness.
Keywords: clustering methods; Boolean data; parameter estimation; robustness
Source: VideoLectures.NET
Last reviewed: 2019-12-07:lxf