Large-Scale Training Framework for Video Annotation |
|
Course URL: | http://videolectures.net/kdd2019_hwang_lee_varadarajan/ |
Lecturer: | Seong Jae Hwang |
Institution: | University of Wisconsin-Madison |
Course date: | 2020-03-02 |
Course language: | English |
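Summary (translated from Chinese): | Video is one of the richest sources of information online, but extracting deep insights from internet-scale video content remains an open problem, in terms of the depth and breadth of understanding as well as scale. Over the past few years, video understanding has advanced considerably thanks to the availability of large-scale video datasets and core progress in image, audio, and video modeling architectures. However, architectures that are state of the art on small-scale datasets are often unsuitable for deployment at internet scale, both in the ability to train such deep networks on hundreds of millions of videos and in the ability to deploy them for inference on billions of videos. This paper presents a MapReduce-based training framework that uses data parallelism and model parallelism to scale the training of complex video models. The framework uses alternating optimization and full-batch fine-tuning and supports large Mixture-of-Experts classifiers with hundreds of thousands of mixtures, which allows a trade-off between model depth and breadth and lets model capacity be shifted between shared (generalization) layers and per-class (specialization) layers. The authors show that the framework reaches state-of-the-art performance on the largest public video datasets, YouTube-8M and Sports-1M, and can scale to datasets 100 times larger. |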
Course description: | Video is one of the richest sources of information available online, but extracting deep insights from video content at internet scale is still an open problem, both in terms of depth and breadth of understanding, as well as scale. Over the last few years, the field of video understanding has made great strides due to the availability of large-scale video datasets and core advances in image, audio, and video modeling architectures. However, the state-of-the-art architectures on small-scale datasets are frequently impractical to deploy at internet scale, both in terms of the ability to train such deep networks on hundreds of millions of videos, and to deploy them for inference on billions of videos. In this paper, we present a MapReduce-based training framework, which exploits both data parallelism and model parallelism to scale training of complex video models. The proposed framework uses alternating optimization and full-batch fine-tuning, and supports large Mixture-of-Experts classifiers with hundreds of thousands of mixtures, which enables a trade-off between model depth and breadth, and the ability to shift model capacity between shared (generalization) layers and per-class (specialization) layers. We demonstrate that the proposed framework is able to reach state-of-the-art performance on the largest public video datasets, YouTube-8M and Sports-1M, and can scale to 100 times larger datasets. |
Keywords: | video annotation; large-scale training framework; data science |
Course source: | VideoLectures.NET |
Data collected: | 2022-09-19:cyh |
Last reviewed: | 2022-09-19:cyh |
Views: | 25 |
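
The abstract mentions MapReduce-style data parallelism combined with full-batch updates. The following is a minimal sketch of that general idea only, not the authors' framework: it assumes a single logistic-regression classifier instead of a Mixture-of-Experts model, runs the "map" step sequentially in one process, and uses hypothetical names (`map_shard`, `reduce_partials`, `full_batch_step`, `train`) that do not come from the paper.

```python
import math
from functools import reduce


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def map_shard(weights, shard):
    """Map step: return (gradient sum, example count) for one data shard."""
    grad = [0.0] * len(weights)
    for features, label in shard:
        pred = sigmoid(sum(w * x for w, x in zip(weights, features)))
        for i, x in enumerate(features):
            # Gradient of the logistic loss with respect to weight i.
            grad[i] += (pred - label) * x
    return grad, len(shard)


def reduce_partials(a, b):
    """Reduce step: add two partial (gradient sum, count) results."""
    return [ga + gb for ga, gb in zip(a[0], b[0])], a[1] + b[1]


def full_batch_step(weights, shards, lr=0.5):
    """One full-batch update built from per-shard partial gradients."""
    partials = (map_shard(weights, shard) for shard in shards)
    grad, count = reduce(reduce_partials, partials)
    return [w - lr * g / count for w, g in zip(weights, grad)]


def train(shards, dim, steps=200):
    """Run a fixed number of full-batch steps (illustrative setting)."""
    weights = [0.0] * dim
    for _ in range(steps):
        weights = full_batch_step(weights, shards)
    return weights


if __name__ == "__main__":
    # Toy data: features are [bias, x]; the label is 1 when x > 0.
    shards = [
        [([1.0, 2.0], 1), ([1.0, -2.0], 0)],
        [([1.0, 1.0], 1), ([1.0, -1.0], 0)],
    ]
    print(train(shards, dim=2))
```

Because the reduce step only sums partial gradients and counts, the update does not depend on how the examples are split into shards. That property is what lets the map step be distributed across workers in a real deployment.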