
Show, Attend and Tell: Neural Image Caption Generation with Visual Attention
Course URL: http://videolectures.net/icml2015_xu_visual_attention/
Lecturer: Kelvin Xu
Institution: Université de Montréal
Date: 2015-12-05
Language: English
Course description: Inspired by recent work in machine translation and object detection, we introduce an attention-based model that automatically learns to describe the content of images. We describe how this model can be trained deterministically using standard backpropagation, and stochastically by maximizing a variational lower bound. We also show through visualization how the model automatically learns to fix its gaze on salient objects while generating the corresponding words in the output sequence. We validate the use of attention with state-of-the-art performance on three benchmark datasets: Flickr8k, Flickr30k and MS COCO.
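Illustrative sketch: the deterministic ("soft") attention step mentioned in the description can be pictured roughly as follows. This is a minimal NumPy sketch, not the authors' implementation. The function name `soft_attention`, the weight names `W_a`, `W_h`, `v`, the single-layer tanh scoring function, and all dimensions are assumptions made for illustration. The idea is that each image location gets a score conditioned on the decoder's previous hidden state, the scores are normalized with a softmax, and the context vector is the weighted average of the location features. Because every step is differentiable, the model can be trained with standard backpropagation.

```python
import numpy as np

def soft_attention(features, hidden, W_a, W_h, v):
    """One soft-attention step (illustrative; all names are hypothetical).

    features: (L, D) array, one D-dim feature vector per image location
    hidden:   (H,)   array, previous decoder hidden state
    W_a: (D, K), W_h: (H, K), v: (K,) are the scoring-network parameters
    """
    # Score each location given the current decoder state.
    scores = np.tanh(features @ W_a + hidden @ W_h) @ v   # shape (L,)
    # Numerically stable softmax over locations.
    scores = scores - scores.max()
    weights = np.exp(scores) / np.exp(scores).sum()       # shape (L,)
    # Context vector: expected feature under the attention weights.
    context = weights @ features                          # shape (D,)
    return context, weights

# Random stand-ins for CNN features and decoder state (assumed sizes).
rng = np.random.default_rng(0)
L, D, H, K = 196, 512, 256, 128
features = rng.normal(size=(L, D))
hidden = rng.normal(size=H)
W_a = rng.normal(size=(D, K)) * 0.01
W_h = rng.normal(size=(H, K)) * 0.01
v = rng.normal(size=K)

context, weights = soft_attention(features, hidden, W_a, W_h, v)
```

Reshaping `weights` to the spatial grid of the feature map (here an assumed 14 x 14) gives the kind of attention map the description refers to when it mentions visualizing where the model fixes its gaze. The stochastic variant would instead sample a single location from `weights`, which is why it needs a different training objective.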
Keywords: visual attention; neural image caption generation
Source: VideoLectures.NET
Data collected: 2023-06-08:chenxin01
Last reviewed: 2023-06-08:chenxin01