Semi-Supervised Sparse Metric Learning Using Alternating Linearization Optimization
Course URL: http://videolectures.net/kdd2010_liu_sss/
Lecturer: Wei Liu
Institution: Columbia University
Date: 2010-10-01
Language: English
Course description: In many scenarios, data can be represented as vectors and then mathematically abstracted as points in a Euclidean space. Because a great number of machine learning and data mining applications need proximity measures over data, a simple and universal distance metric is desirable, and metric learning methods have been explored to produce sensible distance measures consistent with the relationships in the data. However, most existing methods suffer from limited labeled data and expensive training. In this paper, we address these two issues by employing abundant unlabeled data and pursuing sparsity of metrics, resulting in a novel metric learning approach called semi-supervised sparse metric learning. Two important contributions of our approach are: 1) it propagates scarce prior affinities between data to the global scope and incorporates the full affinities into the metric learning; and 2) it uses an efficient alternating linearization method to directly optimize the sparse metric. Compared with conventional methods, ours can effectively take advantage of semi-supervision and automatically discover the sparse metric structure underlying input data patterns. We demonstrate the efficacy of the proposed approach with extensive experiments carried out on six datasets, obtaining clear performance gains over the state of the art.
Keywords: Euclidean space; machine learning; data mining
Source: VideoLectures.NET
Last reviewed: 2019-05-11:lxf
Views: 31