Learning Distance Function by Coding Similarity
Course URL: http://videolectures.net/icml07_kliper_ldfc/
Lecturer: Rioe Kliper
Institution: Hebrew University of Jerusalem
Date: 2007-06-23
Language: English
Abstract: We consider the problem of learning a similarity function from a set of positive equivalence constraints, i.e. "similar" point pairs. We define the similarity in information-theoretic terms, as the gain in coding length when shifting from independent encoding of the pair to joint encoding. Under simple Gaussian assumptions, this formulation leads to a non-Mahalanobis similarity function which is efficient and simple to learn. This function can be viewed as a likelihood ratio test, and we show that the optimal similarity-preserving projection of the data is a variant of Fisher Linear Discriminant. We also show that under some naturally occurring sampling conditions of equivalence constraints, this function converges to a known Mahalanobis distance (RCA). The suggested similarity function exhibits superior performance over alternative Mahalanobis distances learnt from the same data. Its superiority is demonstrated in the context of image retrieval and graph-based clustering, using a large number of data sets.
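The coding-length gain described in the abstract can be illustrated with a small sketch. It assumes that single points are Gaussian with a shared covariance, that a similar pair is jointly Gaussian with a symmetric cross-covariance, and that all parameters are estimated from the pair members alone. The function names (`fit_coding_similarity`, `coding_similarity`) are hypothetical and do not come from the lecture; this is one plausible reading of the abstract, not the authors' implementation.

```python
import numpy as np

def gaussian_logpdf(z, cov):
    """Log-density of a zero-mean Gaussian with covariance `cov` at point z."""
    d = z.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    quad = z @ np.linalg.solve(cov, z)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + quad)

def fit_coding_similarity(pairs_a, pairs_b):
    """Estimate mean, point covariance, and pair cross-covariance.

    pairs_a[i] and pairs_b[i] form the i-th "similar" pair (assumed input layout).
    """
    points = np.vstack([pairs_a, pairs_b])
    mean = points.mean(axis=0)
    centered = points - mean
    cov_x = centered.T @ centered / len(points)
    a, b = pairs_a - mean, pairs_b - mean
    # Symmetrize, because the order of points within a pair is arbitrary.
    cov_xx = (a.T @ b + b.T @ a) / (2 * len(a))
    return mean, cov_x, cov_xx

def coding_similarity(x, y, mean, cov_x, cov_xx):
    """Log-likelihood ratio: joint coding of (x, y) versus independent coding."""
    x, y = x - mean, y - mean
    joint_cov = np.block([[cov_x, cov_xx], [cov_xx, cov_x]])
    joint = gaussian_logpdf(np.concatenate([x, y]), joint_cov)
    return joint - gaussian_logpdf(x, cov_x) - gaussian_logpdf(y, cov_x)
```

A positive score means the pair is cheaper to encode jointly than independently, which under these assumptions is the likelihood ratio test mentioned in the abstract.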
Keywords: similarity function; information-theoretic terms; equivalence constraints
Source: VideoLectures.NET
Last reviewed: 2019-04-17:lxf
Views: 48