Towards Semantic Embedding in Visual Vocabulary
Course URL: http://videolectures.net/cvpr2010_ji_tsev/
Lecturer: Rongrong Ji
Institution: Harbin Institute of Technology
Date: 2010-07-19
Language: English
Course description: Visual vocabulary serves as a fundamental component in many computer vision tasks, such as object recognition, visual search, and scene modeling. While state-of-the-art approaches build visual vocabulary based solely on visual statistics of local image patches, the correlative image labels are left unexploited in generating visual words. In this work, we present a semantic embedding framework to integrate semantic information from Flickr labels for supervised vocabulary construction. Our main contribution is a Hidden Markov Random Field model that supervises feature space quantization, with specialized consideration of label correlations: local visual features are modeled as an Observed Field, which follows visual metrics to partition the feature space. Semantic labels are modeled as a Hidden Field, which imposes generative supervision on the Observed Field, with WordNet-based correlation constraints expressed as a Gibbs distribution. By simplifying the Markov property in the Hidden Field, both unsupervised and supervised (label-independent) vocabularies can be derived from our framework. We validate our performance on two challenging computer vision tasks, with comparisons to state-of-the-art methods: (1) large-scale image search on a Flickr 60,000 database; (2) object recognition on the PASCAL VOC database.
Keywords: visual vocabulary; computer vision tasks; semantic embedding
Source: VideoLectures.NET
Last reviewed: 2019-03-13:lxf
Views: 50