Affiliation:
1. School of Electronic Information, Northwestern Polytechnical University, Xi’an 710072, China
2. Department of Avionics, Military Technical College, Cairo 4393010, Egypt
3. School of Automation, Northwestern Polytechnical University, Xi’an 710072, China
Abstract
Object detection is a fundamental task in remote sensing image processing, and the detection of small or tiny objects is one of its core components. Although combining CNNs with transformer networks has considerably advanced small object detection, there is still room to improve how information associated with small objects is extracted and used. In transformer structures in particular, the global modeling of pixel-level information within small objects disregards the complex, intertwined interplay between spatial context information and channel information. As a result, valuable information is easily obscured or lost. To mitigate this limitation, we propose YOLO-DCTI, a framework that builds on the Contextual Transformer (CoT) for the detection of small or tiny objects. Specifically, within CoT, we incorporate global residual and local fusion mechanisms throughout the entire input-to-output pipeline. This integration allows a deeper exploration of the network’s intrinsic representations and fosters the fusion of spatial contextual attributes with channel characteristics. Moreover, we propose an improved decoupled contextual transformer detection head, denoted DCTI, to resolve the feature conflicts that arise when classification and regression are performed concurrently. Experimental results on the DOTA, VisDrone, and NWPU VHR-10 datasets show that, built on the powerful real-time detection network YOLOv7, the proposed method achieves a better balance between speed and accuracy for tiny targets.
Funder
National Natural Science Foundation of China
Fundamental Research Funds for the Central Universities
Postdoctoral Science Foundation of China
Fourth Special Grant of China Postdoctoral Science Foundation
Subject
General Earth and Planetary Sciences
Reference: 56 articles.
Cited by
14 articles.