SSD: Single Shot MultiBox Detector
Course URL: http://videolectures.net/eccv2016_anguelov_multibox_detector/
Lecturer: Dragomir Anguelov
Offered by: VideoLectures.NET
Date: 2016-10-24
Language: English
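Chinese summary (translated):
We propose a method for detecting objects in images using a single deep neural network. Our method, called SSD, discretizes the output space of bounding boxes into a set of default boxes, scaled according to different aspect ratios and scales at each feature map location. At prediction time, the network generates a score for the presence of each object category in each default box and adjusts the box to better match the object shape. In addition, the network combines predictions from multiple feature maps of different resolutions to handle objects of various sizes naturally. Compared with methods that require object proposals, our SSD model is simple, because it completely eliminates proposal generation and the subsequent pixel or feature resampling stage and encapsulates all computation in a single network. This makes SSD easy to train and straightforward to integrate into systems that need a detection component. Experimental results on the PASCAL VOC, MS COCO, and ILSVRC datasets confirm that SSD achieves accuracy comparable to methods that use an additional object proposal step while being faster, and that it provides a unified framework for training and inference. Compared with other single-stage methods, SSD achieves better accuracy even with a smaller input image size. For 300×300 input, SSD achieves 72.1% mAP on the VOC2007 test set at 58 FPS on an Nvidia Titan X; for 500×500 input, SSD achieves 75.1% mAP, outperforming a comparable state-of-the-art Faster R-CNN model. Code is available at https://github.com/weiliu89/caffe/tree/ssd.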
Course description: We present a method for detecting objects in images using a single deep neural network. Our approach, named SSD, discretizes the output space of bounding boxes into a set of default boxes over different aspect ratios and scales per feature map location. At prediction time, the network generates scores for the presence of each object category in each default box and produces adjustments to the box to better match the object shape. Additionally, the network combines predictions from multiple feature maps with different resolutions to naturally handle objects of various sizes. Our SSD model is simple relative to methods that require object proposals because it completely eliminates proposal generation and the subsequent pixel or feature resampling stage and encapsulates all computation in a single network. This makes SSD easy to train and straightforward to integrate into systems that require a detection component. Experimental results on the PASCAL VOC, MS COCO, and ILSVRC datasets confirm that SSD has comparable accuracy to methods that utilize an additional object proposal step and is much faster, while providing a unified framework for both training and inference. Compared to other single-stage methods, SSD has much better accuracy, even with a smaller input image size. For 300×300 input, SSD achieves 72.1% mAP on VOC2007 test at 58 FPS on an Nvidia Titan X, and for 500×500 input, SSD achieves 75.1% mAP, outperforming a comparable state-of-the-art Faster R-CNN model. Code is available at https://github.com/weiliu89/caffe/tree/ssd.
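The idea of tiling default boxes over several feature maps can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the feature map sizes, the scales, and the aspect ratios below are hypothetical values chosen for illustration, and the width and height formulas (scale times or divided by the square root of the aspect ratio) are an assumed parameterization that the description above does not spell out.

```python
import math


def default_boxes(fmap_size, scale, aspect_ratios):
    """Return (cx, cy, w, h) boxes, normalized to [0, 1], for one square feature map."""
    boxes = []
    for row in range(fmap_size):
        for col in range(fmap_size):
            # Center of this feature-map cell in normalized image coordinates.
            cx = (col + 0.5) / fmap_size
            cy = (row + 0.5) / fmap_size
            for ar in aspect_ratios:
                # Assumed parameterization: every aspect ratio keeps the same
                # area (scale squared); only the box shape changes.
                w = scale * math.sqrt(ar)
                h = scale / math.sqrt(ar)
                boxes.append((cx, cy, w, h))
    return boxes


def multi_scale_default_boxes(layers, aspect_ratios):
    """Concatenate default boxes from several (fmap_size, scale) layers."""
    boxes = []
    for fmap_size, scale in layers:
        boxes.extend(default_boxes(fmap_size, scale, aspect_ratios))
    return boxes


# Hypothetical configuration: coarser feature maps get larger boxes.
LAYERS = [(8, 0.2), (4, 0.5), (2, 0.8)]
ASPECT_RATIOS = [1.0, 2.0, 0.5]

all_boxes = multi_scale_default_boxes(LAYERS, ASPECT_RATIOS)
print(len(all_boxes))  # total number of default boxes across all layers
```

In a detector of this kind, the network would output one score per object category and four box adjustments for each entry of `all_boxes`, so the length of that list fixes the size of the prediction output.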
Keywords: neural networks; image detection; feature map locations
Source: VideoLectures.NET
Data collected: 2022-11-25:chenxin01
Last reviewed: 2022-11-25:chenxin01
Views: 32