Optimizing Slender Target Detection in Remote Sensing with Adaptive Boundary Perception
Published: 2024-07-19
Issue: 14
Volume: 16
Page: 2643
ISSN: 2072-4292
Container-title: Remote Sensing
Short-container-title: Remote Sensing
Language: en
Authors: Zhu Han 1, Jing Donglin 2
Affiliations:
1. College of Sciences, Civil Aviation Flight University of China, Guanghan 618307, China
2. School of Information and Electronics, Beijing Institute of Technology, Beijing 100081, China
Abstract
Over the past few years, target detectors based on Convolutional Neural Networks (CNNs) have been widely applied to remote sensing (RS) imagery, and bounding-box optimization has remained an active research topic. However, existing methods rarely account for the interference caused by the shape and orientation changes of high-aspect-ratio RS targets during training, which makes boundary perception difficult for slender targets. To address this challenge, we introduce the Adaptive Boundary Perception Network (ABP-Net), a novel two-stage approach consisting of pre-training and training phases that enhances the boundary perception of CNN-based detectors. The pre-training phase covers the initialization of the backbone network and label assignment. Traditional label assignment with a fixed IoU threshold fails to cover the critical regions of slender targets, so the detector misses many high-quality positive samples. To overcome this drawback, we design a Shape-Sensitive (S-S) label assignment strategy that improves boundary shape perception by dynamically adjusting the IoU threshold according to the aspect ratio of each target, so that high-quality samples carrying critical features are assigned as positives. During the training phase, minor angle differences of a slender bounding box can cause large changes in the loss value, producing unstable gradients. Such drastic gradient changes prevent the model from finding a stable update direction when optimizing the bounding-box parameters and hinder convergence. To this end, we propose the Robust-Refined (R-R) loss function, which enhances boundary localization perception by focusing on low-error samples and suppressing the gradient amplification caused by difficult samples, thereby improving stability and convergence. Experiments on the UCAS-AOD and HRSC2016 datasets validate that our detector, specialized for high-aspect-ratio targets, improves detection performance and accuracy while remaining simple to operate and quick to deploy.
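To make the S-S label assignment idea concrete, the sketch below illustrates one plausible reading of the strategy: the positive-sample IoU threshold is relaxed as a target's aspect ratio grows, so anchors covering a slender target's critical regions still qualify as positives. The constants (BASE_THR, MIN_THR, DECAY) and the linear relaxation schedule are illustrative assumptions, not the formulation from the paper.

```python
# Minimal sketch of a shape-sensitive (S-S) label assignment rule.
# Assumption: the IoU threshold relaxes linearly with aspect ratio,
# bounded below by a floor; the actual ABP-Net schedule may differ.
import numpy as np

BASE_THR = 0.5   # conventional fixed positive-sample IoU threshold
MIN_THR = 0.3    # floor so the threshold never collapses entirely
DECAY = 0.05     # threshold reduction per unit of (aspect_ratio - 1)

def shape_sensitive_threshold(aspect_ratio: float) -> float:
    """Relax the IoU threshold for slender targets (large aspect ratio)."""
    relaxed = BASE_THR - DECAY * max(aspect_ratio - 1.0, 0.0)
    return max(relaxed, MIN_THR)

def assign_labels(ious: np.ndarray, aspect_ratios: np.ndarray) -> np.ndarray:
    """ious: (num_anchors, num_gt) IoU matrix; aspect_ratios: (num_gt,).
    Returns per-anchor labels: matched ground-truth index, or -1 (negative)."""
    thresholds = np.array([shape_sensitive_threshold(a) for a in aspect_ratios])
    best_gt = ious.argmax(axis=1)                  # best ground truth per anchor
    best_iou = ious[np.arange(len(ious)), best_gt]
    positive = best_iou >= thresholds[best_gt]     # per-target adaptive threshold
    return np.where(positive, best_gt, -1)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ious = rng.uniform(0.0, 0.7, size=(8, 2))      # toy anchor-to-gt IoUs
    ratios = np.array([1.0, 8.0])                  # one square, one slender target
    print(assign_labels(ious, ratios))             # slender target uses a lower threshold
```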
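Similarly, the following sketch shows one way a regression loss can behave as the R-R loss is described: its gradient is informative for low-error samples and decays for hard, high-error samples, so a slender box with a large angle error cannot produce an amplified, destabilizing gradient. A Geman-McClure-style robust penalty is used here as a stand-in; the paper's actual R-R formulation is not given in the abstract.

```python
# Hedged sketch of a "robust-refined" (R-R) style regression loss.
# Assumption: a Geman-McClure penalty e^2 / (e^2 + c^2), whose gradient
# peaks at a small error and then decays, approximates the described
# behavior (focus on low-error samples, suppressed hard-sample gradients).
import torch

def robust_refined_loss(pred: torch.Tensor, target: torch.Tensor,
                        c: float = 1.0) -> torch.Tensor:
    """Robust penalty: the gradient tends to 0 as the error |e| grows."""
    err = pred - target
    return (err ** 2 / (err ** 2 + c ** 2)).mean()

if __name__ == "__main__":
    pred = torch.tensor([0.1, 0.5, 3.0], requires_grad=True)
    target = torch.zeros(3)
    loss = robust_refined_loss(pred, target)
    loss.backward()
    # The large-error (hard) sample, pred=3.0, receives the smallest gradient.
    print(loss.item(), pred.grad)
```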