


Learning Shared and Separate Features of Two Related Data Sets using GPLVMs
Course URL: http://videolectures.net/lms08_leen_lssf/
Lecturer: Gayle Leen
Institution: Aalto University
Date: 2008-12-20
Language: English
Course description: Dual-source learning problems can be formulated as learning a joint representation of the data sources, where the shared information is represented in terms of a shared underlying process. However, there may be situations in which the shared information is not the only useful information, and interesting aspects of the data are not common to both data sets. Some useful features within one data set may not be present in the other and vice versa; this complementary property motivates the use of multiple data sources over a single data source, which captures only one type of useful information. For instance, having two eyes (and two streams of visual data) allows us to gain a 3-D impression of the world. This ability of stereo vision combines both shared features and features private to each data stream to form a coherent representation of the world; common shifted features can be used in disparity estimation to infer the depths of objects, while features that are seen in one view but not in the other, due to occlusions, can provide additional information about the scene. In this work, we present a probabilistic generative framework for analysing two sets of data, where the structure of each data set is represented in terms of a shared and a private latent space. Explicitly modelling a private component for each data set avoids an oversimplified representation of the within-set variation, so that the between-set variation can be modelled more accurately, and also gives insight into potentially interesting features particular to a data set. Since the two data sets may have a complex (possibly nonlinear) relationship, we use nonparametric Bayesian techniques: we define Gaussian process priors over the functions from latent to data spaces, such that each data set is modelled as a Gaussian Process Latent Variable Model (GPLVM) [1], where the dependency structure is captured in terms of shared and private kernels.
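The generative structure described in the abstract (each data set a GP draw whose covariance is a sum of a shared kernel and a data-set-specific private kernel) can be sketched as follows. This is a minimal illustration under assumed hyperparameters, not the authors' implementation; the latent dimensions, lengthscales, and variable names are all illustrative.

```python
import numpy as np

def rbf_kernel(X, lengthscale=1.0, variance=1.0):
    """Squared-exponential kernel on latent points X (N x Q)."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

rng = np.random.default_rng(0)
N = 50
X_shared = rng.normal(size=(N, 2))  # latent coordinates common to both data sets
X_priv1 = rng.normal(size=(N, 1))   # latent coordinates private to data set 1
X_priv2 = rng.normal(size=(N, 1))   # latent coordinates private to data set 2

# Per-data-set covariance = shared kernel + private kernel (+ jitter for stability)
K1 = rbf_kernel(X_shared) + rbf_kernel(X_priv1) + 1e-6 * np.eye(N)
K2 = rbf_kernel(X_shared) + rbf_kernel(X_priv2) + 1e-6 * np.eye(N)

# Each observed dimension of each data set is an independent GP draw
# with the corresponding shared+private covariance.
L1 = np.linalg.cholesky(K1)
L2 = np.linalg.cholesky(K2)
Y1 = L1 @ rng.normal(size=(N, 3))   # data set 1: 3 observed dimensions
Y2 = L2 @ rng.normal(size=(N, 4))   # data set 2: 4 observed dimensions
```

In the actual GPLVM setting the latent coordinates would be optimised (or inferred) from the observed Y1 and Y2 rather than sampled; the sketch only shows how shared and private kernels combine into one covariance per data set.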
Keywords: dual-source learning; shared information; nonparametric Bayesian techniques; Gaussian Process Latent Variable Model (GPLVM)
Source: VideoLectures.NET
Last reviewed: 2019-05-15:cjy
Views: 58