Multispectral Object Detection Based on Multilevel Feature Fusion and Dual Feature Modulation

Published: 2024-01-21
Volume: 13
Issue: 2
Page: 443
ISSN: 2079-9292
Container-title: Electronics
Language: en
Short-container-title: Electronics

Author:

Sun Jin¹, Yin Mingfeng¹, Wang Zhiwei¹, Xie Tao¹, Bei Shaoyi¹

Affiliation:

1. School of Automobile and Traffic Engineering, Jiangsu University of Technology, Changzhou 213001, China

Abstract

Multispectral object detection is a crucial technology in remote sensing image processing, particularly in low-light environments. Most current methods extract features at a single scale, which leads to the fusion of invalid features and to missed small objects. To address these issues, we propose GMD-YOLO, a multispectral object detection network based on multilevel feature fusion and dual feature modulation. First, a novel dual-channel CSPDarknet53 network extracts deep features from visible-infrared images. This network incorporates a Ghost module, which generates additional feature maps through a series of linear operations, balancing accuracy and speed. Second, a multilevel feature fusion (MLF) module exploits cross-modal information by constructing hierarchical residual connections. This strengthens the complementarity between modalities and improves the network's multiscale representation at a finer granularity. Finally, a dual feature modulation (DFM) decoupled head is introduced to enhance small object detection; it serves the distinct requirements of the classification and localization tasks. GMD-YOLO is validated on three public visible-infrared datasets: DroneVehicle, KAIST, and LLVIP. On DroneVehicle and LLVIP, it achieves mAP@0.5 of 78.0% and 98.0%, outperforming the baseline methods by 3.6% and 4.4%, respectively. On KAIST, it attains a miss rate (MR) of 7.73% at 61.7 FPS. The experimental results demonstrate that our method surpasses existing advanced methods and exhibits strong robustness.
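
The "@0.5" in mAP@0.5 refers to an intersection-over-union (IoU) threshold: a predicted box is usually counted as a correct detection only when its IoU with a ground-truth box reaches 0.5. The sketch below illustrates that matching criterion only. It is not the paper's evaluation code, and the names `Box`, `iou`, and `is_true_positive`, the corner box format, and the inclusive comparison are assumptions made for illustration (evaluation toolkits differ on whether the boundary value counts).

```python
from typing import Tuple

# Hypothetical box type: (x1, y1, x2, y2) corners, with x1 <= x2 and y1 <= y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle, if the boxes overlap at all.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    # Degenerate (zero-area) inputs are treated as having no overlap.
    return inter / union if union > 0 else 0.0


def is_true_positive(pred: Box, truth: Box, threshold: float = 0.5) -> bool:
    """True when the prediction overlaps the ground truth enough to count.

    The inclusive comparison (>=) is an assumption of this sketch.
    """
    return iou(pred, truth) >= threshold
```

For example, a prediction `(0, 0, 4, 3)` against a ground-truth box `(0, 0, 4, 4)` has an IoU of 0.75 and would count as a match at the 0.5 threshold, whereas two 2x2 boxes offset by half their width overlap with an IoU of only 1/3 and would not.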

Funder

National Natural Science Foundation of China

Natural Science Research Project of Colleges and Universities of Jiangsu Province

Changzhou Applied Basic Research Project

Publisher

MDPI AG

Link

https://www.mdpi.com/2079-9292/13/2/443/pdf

References (53 articles)

1. Zhao (2019). Object detection with deep learning: A review. IEEE Trans. Neural Netw. Learn. Syst.

2. Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C., and Berg, A.C. (2016, January 11–14). SSD: Single shot multibox detector. Proceedings of the Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands.

3. Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (2016, January 27–30). You Only Look Once: Unified, Real-Time Object Detection. Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA.

4. Singh, A., Bhambhu, Y., Buckchash, H., Gupta, D.K., and Prasad, D.K. (2023). Latent Graph Attention for Enhanced Spatial Context. arXiv.

5. Biswas, M., Buckchash, H., and Prasad, D.K. (2023). pNNCLR: Stochastic Pseudo Neighborhoods for Contrastive Learning based Unsupervised Representation Learning Problems. arXiv.

Cited by 1 article.

1. An Infrared Aircraft Detection Algorithm Based on Context Perception Feature Enhancement. Electronics, 2024-07-10.