EYOLOX: An Efficient One-Stage Object Detection Network Based on YOLOX
Published: 2023-01-23
Issue: 3
Volume: 13
Page: 1506
ISSN: 2076-3417
Container-title: Applied Sciences
Language: en
Short-container-title: Applied Sciences
Author:
Tang Rui (1), Sun Hui (1,2), Liu Di (1), Xu Hui (1, ORCID), Qi Miao (1,2), Kong Jun (3)
Affiliation:
1. College of Information Science and Technology, Northeast Normal University, Changchun 130117, China
2. Institute for Intelligent Elderly Care, Changchun Humanities and Sciences College, Changchun 130117, China
3. Key Laboratory of Applied Statistics of MOE, Northeast Normal University, Changchun 130024, China
Abstract
Object detection has drawn the attention of many researchers due to its wide range of applications in computer vision. In this paper, a novel model is proposed for object detection. Firstly, a new neck is designed for the proposed detection model, comprising an efficient SPPNet (Spatial Pyramid Pooling Network), a modified NLNet (Non-Local Network), and a lightweight adaptive feature fusion module. Secondly, a detection head with a double-residual-branch structure is presented to reduce the latency of the decoupled head and improve detection ability. Finally, these improvements are embedded in YOLOX as plug-and-play modules to form a high-performance detector, EYOLOX (Efficient YOLOX). Extensive experiments demonstrate that EYOLOX achieves significant improvements, raising YOLOX-s from 40.5% to 42.2% AP on the MS COCO dataset with a single GPU. Moreover, EYOLOX also outperforms YOLOv6 and some SOTA methods with the same number of parameters and GFLOPs. Notably, EYOLOX was trained only on the COCO-2017 dataset, without any other datasets, and only the pre-trained weights of the backbone were loaded.
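The "efficient SPPNet" in the neck presumably builds on spatial pyramid pooling; in YOLO-family detectors this is commonly implemented in the SPPF style, where three chained k×k stride-1 max pools replace parallel pools with kernels k, 2k−1, and 3k−2, producing the same concatenated features at lower cost. The NumPy sketch below illustrates that equivalence; it is an assumption for illustration, not the authors' exact module (names `max_pool2d` and `sppf` are hypothetical).

```python
import numpy as np

def max_pool2d(x, k):
    """Stride-1 max pooling with 'same' (-inf) padding on an (H, W) map."""
    pad = k // 2
    xp = np.pad(x, pad, mode="constant", constant_values=-np.inf)
    H, W = x.shape
    out = np.empty_like(x)
    for i in range(H):
        for j in range(W):
            out[i, j] = xp[i:i + k, j:j + k].max()
    return out

def sppf(x, k=5):
    """SPPF-style pooling: chain three k x k max pools and stack
    [x, p1, p2, p3]; equivalent to parallel pools with kernels
    k, 2k-1, 3k-2 applied directly to x."""
    p1 = max_pool2d(x, k)
    p2 = max_pool2d(p1, k)
    p3 = max_pool2d(p2, k)
    return np.stack([x, p1, p2, p3], axis=0)
```

For example, with k=5 the second chained pool equals a single 9×9 pool of the input, since a max over 5-wide windows of 5-wide maxima covers a 9-wide window.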
Funder
National Natural Science Foundation of China; Fund of Jilin Provincial Science and Technology Department
Subject
Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science
References: 58 articles.
Cited by: 6 articles.