OpenKernel
Course URL: http://videolectures.net/icml2010_allauzen_opke/
Lecturer: Cyril Allauzen
Institution: Google Inc.
Date: 2010-07-20
Language: English
Course description: The OpenKernel library is an open-source software library for designing, combining, learning and using kernels for machine learning applications. The library supports the design and use of kernels defined over dense and sparse real vectors, as well as over sequences or distributions of sequences. For dense and sparse features, the library provides implementations of the classical kernels: linear, polynomial, Gaussian and sigmoid. For sequences and distributions of sequences, the library implements the rational kernel framework of Cortes et al. (JMLR, 2004). The library supplies the following sequence kernels: n-gram kernels, gappy n-gram kernels and mismatch kernels (Leslie et al., 2004), and provides utilities for creating arbitrary rational kernels from the weighted finite-state transducers they are based on. Kernels can be combined by taking their sum or their product, and can be composed with a polynomial, a Gaussian or a sigmoid. They support on-demand evaluation and caching. In addition to its own binary format, the library uses the ASCII format of LIBSVM/LIBLINEAR/SVMlight for representing features (and precomputed kernels for LIBSVM). Finally, the OpenKernel library includes several options for using training data to automatically combine multiple kernels, which is particularly useful when the single best kernel for the task is not known. The algorithms implemented include L1-regularized linear combinations (Lanckriet et al., JMLR 2004), L2-regularized linear combinations (Cortes et al., UAI 2009), L2-regularized quadratic combinations (Cortes et al., NIPS 2009), and combinations based on kernel correlation, or alignment (Cortes et al., ICML 2010). Specialized efficient versions of these algorithms are also made available for weighting features and sparseness, and can be used to further improve efficiency. The output kernels can easily be used in conjunction with LIBSVM, SVMlight and the included kernel ridge regression implementations. Full reference documentation, tutorials and examples (with formatted datasets) are included. The library is an open-source project distributed under the Apache license (2.0). This work has been partially supported by Google Inc. The library uses the OpenFst library for representing and manipulating weighted finite-state transducers.
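
The kernel operations named in the description can be illustrated with a minimal, self-contained Python sketch. This is not the OpenKernel API: every function name below (`linear`, `polynomial`, `gaussian`, `ngram_kernel`, `kernel_sum`, `kernel_product`, `gaussian_of`, `gram_matrix`, `alignment`) is hypothetical, and the n-gram kernel is computed by direct counting rather than with weighted finite-state transducers. The `alignment` function shows the standard centered kernel alignment score, which is one quantity an alignment-based combination could rely on; how the library actually computes its combinations is not stated here.

```python
import math
from collections import Counter

# Classical kernels on dense real vectors (hypothetical helper names).
def linear(x, y):
    return sum(a * b for a, b in zip(x, y))

def polynomial(x, y, degree=2, c=1.0):
    return (linear(x, y) + c) ** degree

def gaussian(x, y, gamma=0.5):
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

# n-gram kernel on sequences: inner product of n-gram count vectors.
def ngram_kernel(s, t, n=2):
    cs = Counter(s[i:i + n] for i in range(len(s) - n + 1))
    ct = Counter(t[i:i + n] for i in range(len(t) - n + 1))
    return sum(count * ct[gram] for gram, count in cs.items())

# Combination by sum or product; both preserve positive semi-definiteness.
def kernel_sum(k1, k2):
    return lambda x, y: k1(x, y) + k2(x, y)

def kernel_product(k1, k2):
    return lambda x, y: k1(x, y) * k2(x, y)

# Composition with a Gaussian, using the distance induced by a base kernel.
def gaussian_of(kernel, gamma=0.5):
    def composed(x, y):
        d2 = kernel(x, x) + kernel(y, y) - 2.0 * kernel(x, y)
        return math.exp(-gamma * d2)
    return composed

def gram_matrix(kernel, xs):
    return [[kernel(a, b) for b in xs] for a in xs]

# Centered kernel alignment between two Gram matrices.
def center(K):
    m = len(K)
    row = [sum(r) / m for r in K]
    col = [sum(K[i][j] for i in range(m)) / m for j in range(m)]
    total = sum(row) / m
    return [[K[i][j] - row[i] - col[j] + total for j in range(m)]
            for i in range(m)]

def frobenius(A, B):
    return sum(a * b for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def alignment(K1, K2):
    A, B = center(K1), center(K2)
    return frobenius(A, B) / math.sqrt(frobenius(A, A) * frobenius(B, B))

if __name__ == "__main__":
    xs = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    ys = [1.0, 1.0, -1.0, -1.0]
    target = [[a * b for b in ys] for a in ys]  # label kernel y y^T
    for name, k in [("linear", linear), ("gaussian", gaussian),
                    ("sum", kernel_sum(linear, gaussian))]:
        print(name, round(alignment(gram_matrix(k, xs), target), 3))
    print(ngram_kernel("abab", "abba"))
```

In this sketch a combined kernel is just another Python callable, so sums, products and Gaussian compositions can be nested freely and evaluated on demand; a Gram matrix built with `gram_matrix` could then be written out in LIBSVM's precomputed-kernel format.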
Keywords: open-source software library; sparse features; classical kernels
Source: VideoLectures.NET
Last reviewed: 2019-04-24:cwx
Views: 52