Automated Character Annotation in Multimedia | MOOC Overseas Open Courses

Automated Character Annotation in Multimedia


Course URL:	http://videolectures.net/mcvc08_zisserman_acam/
Lecturer:	Andrew Zisserman
Institution:	University of Oxford
Date:	2008-02-14
Language:	English
Description:	We describe progress in automatically identifying characters in films and TV series using their detected faces together with readily available annotation in the form of subtitles and transcripts. We describe how the subtitles and transcript can be aligned to give weak supervision on the characters present in a shot (as well as on the actions, emotions, locations, etc.). The supervision is weak because of correspondence problems and because the character may not be visible. The visual problem of face recognition is challenging because faces appear in images at various sizes and poses, and also vary considerably in expression. Fortunately, videos contain multiple face examples of each person in a form that can easily be associated automatically using straightforward visual tracking. These multiple examples reduce the ambiguity of recognition. We show that the text supervision can be strengthened by speaker detection. Although the labelling is still incomplete and noisy, it is then sufficient to learn visual models for recognition and to achieve successful character identification. This is joint work with Mark Everingham and Josef Sivic.
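Keywords:	visual models; automatic identification from readily available annotation; visual tracking; automatic association
Source:	VideoLectures.NET
Last reviewed:	2019-05-16 (cjy)
Views:	50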
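
The description mentions aligning subtitles (which carry timing but no speaker names) with a transcript (which carries speaker names but no timing). The following is a minimal sketch of that idea only, not the method used in the talk: it assumes word-level matching with Python's standard `difflib.SequenceMatcher` as a stand-in for whatever alignment the authors used. The function name `label_subtitles`, the tuple layouts, and the sample dialogue are all hypothetical.

```python
from collections import Counter
from difflib import SequenceMatcher
import re


def words(text):
    """Lowercase a line and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def label_subtitles(subtitles, transcript):
    """Attach a speaker name to each timed subtitle (hypothetical helper).

    subtitles:  list of (start_sec, end_sec, text), timing but no speaker
    transcript: list of (speaker, text), speaker but no timing
    Returns a list of (start_sec, end_sec, speaker or None).
    """
    # Flatten both sources into word sequences, remembering where each
    # word came from.
    sub_words, sub_owner = [], []
    for i, (_, _, text) in enumerate(subtitles):
        for w in words(text):
            sub_words.append(w)
            sub_owner.append(i)

    tr_words, tr_speaker = [], []
    for speaker, text in transcript:
        for w in words(text):
            tr_words.append(w)
            tr_speaker.append(speaker)

    # Every matched word casts one vote for the transcript speaker.
    votes = [Counter() for _ in subtitles]
    matcher = SequenceMatcher(None, sub_words, tr_words, autojunk=False)
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            votes[sub_owner[block.a + k]][tr_speaker[block.b + k]] += 1

    # A subtitle with no matched words stays unlabelled (None), which is
    # one reason the resulting supervision is incomplete.
    labelled = []
    for (start, end, _), vote in zip(subtitles, votes):
        speaker = vote.most_common(1)[0][0] if vote else None
        labelled.append((start, end, speaker))
    return labelled


# Hypothetical sample data, invented for illustration.
subtitles = [
    (10.0, 12.5, "Where have you been?"),
    (12.6, 14.0, "I was at the library."),
    (14.2, 15.0, "Hmm."),
]
transcript = [
    ("ALICE", "Where have you been all night?"),
    ("BOB", "I was at the library."),
]
print(label_subtitles(subtitles, transcript))
```

The output says who is speaking during each time interval, not which face on screen belongs to that speaker, and the speaker may not be visible at all. This is the sense in which the description calls the supervision weak.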
