Enhancing Small Object Detection in Aerial Images: A Novel Approach with PCSG Model
Published: 2024-05-14
Issue: 5
Volume: 11
Page: 392
ISSN: 2226-4310
Container-title: Aerospace
Language: en
Short-container-title: Aerospace
Author:
An Kang 1, Duanmu Huiping 1, Wu Zhiyang 1, Liu Yuqiang 1, Qiao Jingzhen 1, Shangguan Qianqian 1, Song Yaqing 1, Xu Xiaonong 1
Affiliation:
1. The College of Information, Mechanical and Electrical Engineering, Shanghai Normal University, Shanghai 201418, China
Abstract
General-purpose object detection algorithms perform well on large and medium-sized targets but struggle with small ones. With the growing importance of aerial images in urban transportation and environmental monitoring, detecting small targets in such imagery has become an active research topic. The difficulty of small object detection lies in the small proportion of pixels each object occupies and the resulting complexity of feature extraction. Moreover, current mainstream detection algorithms tend to be overly complex, which leads to structural redundancy when they are applied to small objects. To address these challenges, this paper proposes the PCSG model, based on YOLOv5, which optimizes both the detection head and the backbone network. (1) An enhanced detection head is introduced, with a new structure that improves the feature pyramid network and the path aggregation network. This strengthens the model's reuse of shallow features and adds a dedicated detection layer for smaller objects. In addition, redundant structures in the network are pruned, and the lightweight, general-purpose upsampling operator CARAFE is used to improve upsampling. (2) The paper introduces a module named SPD-Conv to replace the strided convolution and pooling structures in YOLOv5, improving the backbone's feature extraction capability. Furthermore, Ghost convolution is used to reduce the parameter count so that the backbone meets the real-time requirements of aerial image detection. Experimental results on the RSOD dataset show that the PCSG model achieves better detection performance: mAP increases from 97.1% to 97.8%, while the number of model parameters decreases by 22.3%, from 1,761,871 to 1,368,823. These results demonstrate the effectiveness of the approach.
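The abstract does not spell out how SPD-Conv avoids the information loss of strided convolution and pooling. The core idea can be illustrated with a space-to-depth rearrangement, which downsamples a feature map by moving pixels into extra channels instead of discarding them. The sketch below is a minimal pure-Python illustration on a single-channel toy map, not the paper's implementation; the function name `space_to_depth`, the `scale` parameter, and the toy input are assumptions made for this example, and the non-strided convolution that would normally follow the rearrangement is omitted.

```python
def space_to_depth(feature_map, scale=2):
    """Split one H x W channel into scale*scale channels of size (H/scale) x (W/scale).

    Every input value lands in exactly one output channel, so nothing is discarded
    (unlike a stride-2 convolution or pooling, which drop or merge positions).
    """
    height = len(feature_map)
    width = len(feature_map[0])
    if height % scale or width % scale:
        raise ValueError("height and width must be divisible by scale")

    sub_maps = []
    for row_offset in range(scale):
        for col_offset in range(scale):
            # Take every `scale`-th row and column, starting at the given offsets.
            sub_map = [
                row[col_offset::scale]
                for row in feature_map[row_offset::scale]
            ]
            sub_maps.append(sub_map)
    return sub_maps


# Hypothetical 4 x 4 single-channel map holding the values 0..15.
toy_map = [[4 * r + c for c in range(4)] for r in range(4)]
channels = space_to_depth(toy_map, scale=2)

for index, channel in enumerate(channels):
    print(index, channel)
```

With `scale=2`, a 4 x 4 map becomes four 2 x 2 channels. In a real backbone the same rearrangement would be applied to every input channel (PyTorch's `torch.nn.PixelUnshuffle` performs an equivalent operation on tensors), and a stride-1 convolution would then mix the enlarged channel dimension.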
Funder
National Natural Science Foundation of China