A Matrix Factorization Approach for Integrating Multiple Data Views
Course URL: http://videolectures.net/ecmlpkdd09_greene_mfaimdv/
Lecturer: Derek Greene
Institution: University College Dublin
Date: 2009-10-20
Language: English
Description: In many domains there will exist different representations or “views” describing the same set of objects. Taken alone, these views will often be deficient or incomplete. Therefore a key problem for exploratory data analysis is the integration of multiple views to discover the underlying structures in a domain. This problem is made more difficult when disagreement exists between views. We introduce a new unsupervised algorithm for combining information from related views, using a “late integration” strategy. Combination is performed by applying an approach based on matrix factorization to group related clusters produced on individual views. This yields a projection of the original clusters in the form of a new set of “meta-clusters” covering the entire domain. We also provide a novel model selection strategy for identifying the correct number of meta-clusters. Evaluations performed on a number of multi-view text clustering problems demonstrate the effectiveness of the algorithm.
Keywords: views; underlying structure; original clusters
Source: VideoLectures.NET
Last reviewed: 2019-03-24:cwx
Times read: 82
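
The "late integration" idea in the description (cluster each view separately, then factorize the resulting clusters into meta-clusters) can be illustrated with a small sketch. This is not the lecturer's implementation: the function names (`membership_rows`, `nmf`, `integrate_views`) are hypothetical, a plain multiplicative-update non-negative matrix factorization stands in for whatever factorization the talk uses, and the model selection step for choosing the number of meta-clusters is not covered.

```python
import numpy as np

def membership_rows(labels, n_clusters):
    """One binary row per cluster: entry is 1 where the object belongs to it."""
    labels = np.asarray(labels)
    return (labels[None, :] == np.arange(n_clusters)[:, None]).astype(float)

def nmf(X, k, n_iter=1000, seed=0, eps=1e-9):
    """Basic NMF, X ~ P @ H, using multiplicative updates (Frobenius loss)."""
    rng = np.random.default_rng(seed)
    P = rng.random((X.shape[0], k)) + eps
    H = rng.random((k, X.shape[1])) + eps
    for _ in range(n_iter):
        H *= (P.T @ X) / (P.T @ P @ H + eps)
        P *= (X @ H.T) / (P @ H @ H.T + eps)
    return P, H

def integrate_views(view_labels, k, **nmf_kwargs):
    """Stack the clusters from every view as rows, factorize, and assign
    each object to the meta-cluster with the largest weight in H."""
    X = np.vstack([membership_rows(l, int(np.max(l)) + 1) for l in view_labels])
    P, H = nmf(X, k, **nmf_kwargs)
    return H.argmax(axis=0), P, H

# Two views that agree on the grouping but number their clusters differently.
view_a = [0, 0, 0, 1, 1, 1]
view_b = [1, 1, 1, 0, 0, 0]
meta_labels, P, H = integrate_views([view_a, view_b], k=2)
print(meta_labels)
```

In this sketch the rows of `P` indicate how strongly each original cluster loads on each meta-cluster, and the columns of `H` give the object-level memberships. Views that cover only part of the domain could be handled, under the same assumptions, by leaving zeros in the columns of objects a view does not describe.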