Affiliation:
1. School of Electronics and Information, Northwestern Polytechnical University, Xi’an 710129, China
2. School of Information and Electronics, Beijing Institute of Technology, Beijing 100081, China
3. College of Information Engineering, Northwest A&F University, Xianyang 712100, China
Abstract
The widespread application of convolutional neural networks (CNNs) has led to significant advances in object detection. However, typical CNN-based methods still struggle to extract critical features efficiently and precisely in remote sensing detection tasks, for two reasons. (1) The convolutional kernels slide horizontally in the backbone and are therefore misaligned with the features of arbitrarily oriented objects. In addition, the detector's classification and regression branches share the features extracted by the backbone, yet classification requires orientation-invariant features while regression requires orientation-sensitive features. This inconsistency makes it difficult for the detector to extract the critical features each task needs. (2) Deeper convolutional structures can improve detection accuracy, but they also incur substantial convolutional computation and feature redundancy, making feature extraction inefficient. To address these issues, we propose a Task-Sensitive Efficient Feature Extraction Network (TFE-Net). Specifically, we propose a mixed fast convolution module for constructing an efficient network architecture; it replaces some of the convolution operations with cheap transform operations, generating more features with fewer parameters and less computation. Next, we introduce a task-sensitive detection module, which first aligns the convolutional features with the targets using adaptive dynamic convolution based on target orientation. A task-sensitive feature decoupling mechanism then extracts orientation-sensitive and orientation-invariant features from the aligned features and feeds them into the regression and classification branches, respectively, providing each task with the critical features it needs and thereby improving overall detection performance.
In addition, to stabilize training, we propose a balanced loss function that balances the gradients generated by different samples. Extensive experiments on DOTA, UCAS-AOD, and HRSC2016 demonstrate that TFE-Net achieves superior performance and an effective balance between detection speed and accuracy.
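The parameter saving behind replacing some convolutions with cheap transform operations can be illustrated with a small weight count. This is only a sketch under stated assumptions: the abstract does not specify the internals of the mixed fast convolution module, so the example assumes a GhostNet-style layout in which a fraction of the output channels comes from a standard convolution and the remainder from depthwise 3 x 3 transforms. The function names, the `ratio` parameter, and the layer sizes are hypothetical and not taken from the paper.

```python
def conv_params(c_in, c_out, k):
    """Weight count of a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def ghost_style_params(c_in, c_out, k, ratio=2, cheap_k=3):
    """Weight count when only c_out // ratio channels use a full convolution.

    Assumption (not from the paper): the remaining channels are produced by
    depthwise cheap_k x cheap_k transforms, one filter per generated channel.
    """
    if c_out % ratio != 0:
        raise ValueError("c_out must be divisible by ratio")
    primary = c_out // ratio          # channels from the full convolution
    cheap = c_out - primary           # channels from cheap transforms
    return conv_params(c_in, primary, k) + cheap * cheap_k * cheap_k


if __name__ == "__main__":
    standard = conv_params(256, 256, 3)
    ghost = ghost_style_params(256, 256, 3)
    print(standard, ghost, round(standard / ghost, 2))
```

For this hypothetical 256-to-256-channel layer, the cheap-transform variant needs roughly half the weights of the standard convolution while producing the same number of output channels.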
Funder
Aviation Science Foundation