Adaptive Slicing-Aided Hyper Inference for Small Object Detection in High-Resolution Remote Sensing Images
Published: 2023-02-24
Issue: 5
Volume: 15
Page: 1249
ISSN: 2072-4292
Container-title: Remote Sensing
Language: en
Short-container-title: Remote Sensing
Author:
Zhang Hao (1), Hao Chuanyan (1), Song Wanru (1), Jiang Bo (1), Li Baozhu (2)
Affiliation:
1. School of Education Science and Technology, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
2. Internet of Things & Smart City Innovation Platform, Zhuhai Fudan Innovation Institute, Zhuhai 519031, China
Abstract
In the field of object detection, deep learning models have achieved great success in recent years. Despite these advances, detecting small objects remains difficult. Most objects in aerial images exhibit characteristics that challenge traditional object detection techniques, including small size, high density, high variability, and varying orientation. Previous approaches have applied slicing methods to high-resolution images or feature maps to improve performance, but existing slicing methods inevitably introduce redundant computation. Therefore, in this article we present a novel adaptive slicing method named ASAHI (Adaptive Slicing Aided Hyper Inference), which dramatically reduces redundant computation by using an adaptive slicing size. Specifically, ASAHI focuses on the number of slices rather than the slicing size; that is, it adaptively adjusts the slicing size to control the number of slices according to the image resolution. Additionally, we replace standard non-maximum suppression with Cluster-DIoU-NMS in the post-processing stage, owing to its improved accuracy and inference speed. In extensive experiments, ASAHI achieves competitive performance on the VisDrone and xView datasets. The results show that mAP50 increases by 0.9% and computation time falls by 20–25% compared with state-of-the-art slicing methods on the TPH-YOLOv5 pretrained model. On the VisDrone2019-DET-val dataset, our method achieves an mAP50 of 56.4%, demonstrating the superiority of our approach.
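To make the slicing idea in the abstract concrete, the sketch below fixes a target number of slices and derives the slice size from the image resolution, so larger images get proportionally larger slices instead of more of them. It is a minimal illustration of that principle only: the function name, the target slice count of 6, and the 20% overlap ratio are assumptions, not the authors' published implementation.

```python
# Minimal sketch of resolution-adaptive slicing: fix the slice COUNT,
# derive the slice SIZE. All parameter values here are illustrative.
import math
from typing import List, Tuple

def adaptive_slice_windows(
    img_w: int,
    img_h: int,
    target_slices: int = 6,      # assumed target slice count per image
    overlap_ratio: float = 0.2,  # assumed fractional overlap between slices
) -> List[Tuple[int, int, int, int]]:
    """Return (x0, y0, x1, y1) slice windows covering the whole image."""
    # Pick a cols x rows grid whose product is close to target_slices and
    # whose shape roughly matches the image aspect ratio.
    cols = max(1, round(math.sqrt(target_slices * img_w / img_h)))
    rows = max(1, math.ceil(target_slices / cols))

    # Slice size scales with resolution so the slice count stays stable;
    # the denominator accounts for the overlapping portions.
    slice_w = math.ceil(img_w / (cols - (cols - 1) * overlap_ratio))
    slice_h = math.ceil(img_h / (rows - (rows - 1) * overlap_ratio))
    step_x = int(slice_w * (1 - overlap_ratio))
    step_y = int(slice_h * (1 - overlap_ratio))

    windows = []
    for r in range(rows):
        for c in range(cols):
            # Clamp the last column/row so windows never leave the image.
            x0 = min(c * step_x, max(0, img_w - slice_w))
            y0 = min(r * step_y, max(0, img_h - slice_h))
            windows.append((x0, y0, min(x0 + slice_w, img_w), min(y0 + slice_h, img_h)))
    return windows

# A 4000x3000 aerial image and a 1000x750 one both yield 6 slices,
# with slice sizes scaled to each resolution.
print(adaptive_slice_windows(4000, 3000))
print(adaptive_slice_windows(1000, 750))
```

Under this scheme, per-image inference cost stays roughly constant across resolutions, which is the source of the redundant-computation savings the abstract reports; detections from the overlapping windows would then be merged in post-processing (the paper uses Cluster-DIoU-NMS for this step).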
Funder
National Natural Science Foundation of China; Shandong Provincial Natural Science Foundation; China Postdoctoral Science Foundation
Subject
General Earth and Planetary Sciences
Cited by
10 articles.