


A Generalization of Haussler's Convolution Kernel - Mapping Kernel
Course URL: http://videolectures.net/icml08_shin_ghsk/
Lecturer: Kilho Shin
Institution: Carnegie Mellon University
Date: 2008-07-28
Language: English
Course description: Haussler's convolution kernel provides a successful framework for engineering new positive semidefinite kernels, and has been applied to a wide range of data types and applications. In this framework, each data object represents a finite set of finer-grained components. Haussler's convolution kernel then takes a pair of data objects as input and returns the sum of the values of a predetermined primitive kernel computed over all possible pairs of components of the input data objects. Owing to this definition, Haussler's convolution kernel is also known as the cross product kernel, and it is positive semidefinite whenever the primitive kernel is. The mapping kernel that we introduce in this paper is a natural generalization of Haussler's convolution kernel, in that the input to the primitive kernel ranges over a predetermined subset rather than the entire cross product. Although several instances of the mapping kernel appear in the literature, their positive semidefiniteness was investigated in a case-by-case manner and, worse yet, was sometimes incorrectly concluded. In fact, there exists a simple and easily checkable necessary and sufficient condition, which is generic in the sense that it enables us to investigate the positive semidefiniteness of an arbitrary instance of the mapping kernel. This is the first paper to present and prove the validity of this condition. In addition, we introduce two important instances of the mapping kernel, which we refer to as the size-of-index-structure-distribution kernel and the edit-cost-distribution kernel. Both are naturally derived from well-known (dis)similarity measures in the literature (e.g. the maximum agreement tree, the edit distance), and can reasonably be expected to improve the performance of the existing measures by evaluating their distributional features rather than their peak (maximum/minimum) features.
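The abstract's central contrast can be made concrete with a small sketch (not from the lecture itself): Haussler's convolution kernel sums a primitive kernel over the entire cross product of component sets, while the mapping kernel sums it only over a predetermined subset of that cross product. The component sets, the product primitive kernel `k(a, b) = a * b`, and the mapping function `M` below are all illustrative choices.

```python
def convolution_kernel(X, Y, k):
    """Haussler's convolution (cross product) kernel:
    sum k over the entire cross product X x Y."""
    return sum(k(x, y) for x in X for y in Y)

def mapping_kernel(X, Y, M, k):
    """Mapping kernel: sum k only over a predetermined
    subset M(X, Y) of the cross product X x Y."""
    return sum(k(x, y) for (x, y) in M(X, Y))

# Illustrative primitive kernel (positive semidefinite).
k = lambda a, b: a * b

X, Y = [1, 2, 3], [4, 5]

# When M returns the full cross product, the mapping kernel
# reduces to the convolution kernel.
full = lambda X, Y: [(x, y) for x in X for y in Y]
assert mapping_kernel(X, Y, full, k) == convolution_kernel(X, Y, k)

# A restricted mapping: only component pairs at the same index.
aligned = lambda X, Y: list(zip(X, Y))
print(convolution_kernel(X, Y, k))       # 54 = (1+2+3) * (4+5)
print(mapping_kernel(X, Y, aligned, k))  # 14 = 1*4 + 2*5
```

Restricting the summation this way is what makes positive semidefiniteness non-trivial: it holds for every choice of PSD primitive kernel only when the family of subsets `M` satisfies the checkable condition established in the paper.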
Keywords: convolution kernel; positive semidefinite kernel; finer-grained components
Source: VideoLectures.NET
Last reviewed: 2019-04-21: lxf
Views: 71