Do We Need More Training Data or Better Models for Object Detection?
Course URL: http://videolectures.net/bmvc2012_fowlkes_object_detection/
Lecturer: Charless C. Fowlkes
Institution: University of California
Date: 2012-10-09
Language: English

Course description: Datasets for training object recognition systems are steadily growing in size. This paper investigates the question of whether existing detectors will continue to improve as data grows, or if models are close to saturating due to limited model complexity and the Bayes risk associated with the feature spaces in which they operate. We focus on the popular paradigm of scanning-window templates defined on oriented gradient features, trained with discriminative classifiers. We investigate the performance of mixtures of templates as a function of the number of templates (complexity) and the amount of training data. We find that additional data does help, but only with correct regularization and treatment of noisy examples or "outliers" in the training data. Surprisingly, the performance of problem domain-agnostic mixture models appears to saturate quickly (10 templates and 100 positive training examples per template). However, compositional mixtures (implemented via composed parts) give much better performance because they share parameters among templates, and can synthesize new templates not encountered during training. This suggests there is still room to improve performance with linear classifiers and the existing feature space by improved representations and learning algorithms.
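The scanning-window template paradigm the description refers to can be illustrated with a minimal sketch: quantize each pixel's gradient orientation into a histogram bin (a toy stand-in for HOG-style oriented gradient features), then slide a linear template over the feature map and score every window by a dot product, as a linear classifier would. This is not the authors' implementation; the function names (`orientation_features`, `scan_template`), the per-pixel binning, and all parameters are illustrative assumptions.

```python
import numpy as np

def orientation_features(image, n_bins=9):
    """Toy oriented-gradient features: each pixel votes its gradient
    magnitude into one of n_bins unsigned-orientation bins.
    (A simplified stand-in for HOG; no cells, blocks, or normalization.)"""
    gy, gx = np.gradient(image.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)  # unsigned orientation in [0, pi)
    bins = np.minimum((ang / np.pi * n_bins).astype(int), n_bins - 1)
    feat = np.zeros(image.shape + (n_bins,))
    rows, cols = np.indices(image.shape)
    feat[rows, cols, bins] = mag             # magnitude-weighted vote
    return feat

def scan_template(feat, template):
    """Slide a linear template over the feature map; each window's score
    is the dot product of its features with the template weights.
    Returns the best-scoring window position (row, col) and its score."""
    th, tw, _ = template.shape
    H, W, _ = feat.shape
    best_score, best_pos = -np.inf, (0, 0)
    for r in range(H - th + 1):
        for c in range(W - tw + 1):
            score = np.sum(feat[r:r + th, c:c + tw] * template)
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_pos, best_score
```

A mixture of templates, as studied in the paper, would simply run several such templates and keep the highest-scoring one per window; in practice the template weights come from a discriminative learner such as a linear SVM rather than being hand-set.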
Keywords: datasets; learning algorithms
Source: VideoLectures.NET
Data collected: 2021-04-07: zyk
Last reviewed: 2021-04-07: zyk
Views: 35