Compact Sparse R-CNN: Speeding up sparse R-CNN by reducing iterative detection heads and simplifying feature pyramid network

Published:2023-05-01 Issue:5 Volume:13 Page:
ISSN:2158-3226
Container-title:AIP Advances
language:en

Author:

He Zihang¹, Ye Xiang¹, Li Yong¹

Affiliation:

1. School of Electronic Engineering, Beijing University of Posts and Telecommunications, 10 Xitucheng Road, Beijing, China

Abstract

Processing a large number of proposals usually accounts for a significant proportion of inference time in two-stage object detection methods. Sparse regions with CNN features (Sparse R-CNN) was proposed to replace anchor-derived proposals with a small number of learnable proposals. To decrease the miss rate, Sparse R-CNN uses six iterative detection heads to gradually regress the detection boxes to the corresponding objects, which increases the inference time. To reduce the number of iterative heads, we propose the iterative Hungarian assigner, which encourages Sparse R-CNN to generate multiple proposals for each object at the inference stage. This decreases the miss rate when the number of iterative heads is small. As a result, Sparse R-CNN using the proposed assigner needs fewer iterative heads yet gives higher detection accuracy. We also observe that the multi-layer outputs of the feature pyramid network contribute little to Sparse R-CNN and propose replacing it with a single-layer output neck. The single-layer output neck further improves the inference speed of Sparse R-CNN without sacrificing detection accuracy. Experimental results show that the proposed iterative Hungarian assigner together with the single-layer output neck improves Sparse R-CNN by 2.5 AP50 on the Microsoft Common Objects in Context (MS-COCO) dataset and by 3.0 AP50 on the PASCAL Visual Object Classes (VOC) dataset while reducing floating point operations (FLOPs) by 30%.

Funder

National Natural Science Foundation of China

Beijing Key Laboratory of Work Safety Intelligent Monitoring, Beijing University of Posts and Telecommunications

Publisher

AIP Publishing

Subject

General Physics and Astronomy

Link

https://pubs.aip.org/aip/adv/article-pdf/doi/10.1063/5.0146453/17320803/055205_1_5.0146453.pdf

Reference: 36 articles.

1. Defect detection in vehicle mirror nonplanar surfaces with multi-scale atrous single-shot detect mechanism; AIP Adv., 2021

2. A detection method for impact point water columns based on improved YOLO X; AIP Adv., 2022

3. Focal loss for dense object detection; 2017

4. CenterNet: Keypoint triplets for object detection; 2019

Cited by 1 article.

1. STD-YOLOv8: A lightweight small target detection algorithm for UAV perspectives; Electronic Research Archive, 2024