Affiliation:
1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650504, China
2. Yunnan Key Lab of Artificial Intelligence, Kunming University of Science and Technology, Kunming 650504, China
Abstract
Addressing the challenge of efficiently detecting objects in ultra-high-resolution images during object detection tasks, this paper proposes a novel method called MegaDetectNet, which leverages foreground image for large-scale resolution image object detection. MegaDetectNet utilizes a foreground extraction network to generate a foreground image that highlights target regions, thus avoiding the computationally intensive process of dividing the image into multiple sub-images for detection, and significantly improving the efficiency of object detection. The foreground extraction network in MegaDetectNet is built upon the YOLOv5 model with modifications: the large object detection head and classifier are removed, and the PConv convolution is introduced to reconstruct the C3 module, thereby accelerating the convolution process and enhancing foreground extraction efficiency. Furthermore, a Res2Rep convolutional structure is developed to enlarge the receptive field and improve the accuracy of foreground extraction. Finally, a foreground image construction method is proposed, fusing and stitching foreground target regions into a unified foreground image. This approach replaces multiple divided sub-images with a single foreground image for detection, reducing overhead time. The proposed MegaDetectNet method’s effectiveness for detecting objects in ultra-high-resolution images is validated using the publicly available DOTA dataset. Experimental results demonstrate that MegaDetectNet achieves an average time reduction of 83.8% compared to the sub-image division method among various commonly used object detectors, with only a marginal 8.7% decrease in mAP (mean Average Precision). This validates the practicality and efficacy of the MegaDetectNet method for object detection in ultra-high-resolution images.
Funder
National Innovation Special Zone Project
Subject
Electrical and Electronic Engineering,Computer Networks and Communications,Hardware and Architecture,Signal Processing,Control and Systems Engineering
Reference44 articles.
1. Van Etten, A. (2018). You only look twice: Rapid multi-scale object detection in satellite imagery. arXiv.
2. On improving bounding box representations for oriented object detection;Yao;IEEE Trans. Geosci. Remote Sens.,2022
3. Hou, L., Lu, K., Yang, X., Li, Y., and Xue, J. (2023). G-rep: Gaussian representation for arbitrary-oriented object detection. Remote Sens., 15.
4. Advancing plain vision transformer toward remote sensing foundation model;Wang;IEEE Trans. Geosci. Remote Sens.,2022
5. Wu, Y., and Li, J. (2023). YOLOv4 with Deformable-Embedding-Transformer Feature Extractor for Exact Object Detection in Aerial Imagery. Sensors, 23.