
MegaDetectNet: A Fast Object Detection Framework for Ultra-High-Resolution Images

Published: 2023-09-05 Issue: 18 Volume: 12 Page: 3737
ISSN: 2079-9292
Container-title: Electronics
Language: en
Short-container-title: Electronics

Author:

Wang Jian¹,², Zhang Yuesong¹, Zhang Fei¹, Li Yazhou¹, Nie Lingcong¹, Zhao Jiale¹

Affiliation:

1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650504, China

2. Yunnan Key Lab of Artificial Intelligence, Kunming University of Science and Technology, Kunming 650504, China

Abstract

To address the challenge of efficiently detecting objects in ultra-high-resolution images, this paper proposes MegaDetectNet, a novel method that uses a foreground image for object detection in large-scale, high-resolution images. MegaDetectNet uses a foreground extraction network to generate a foreground image that highlights target regions, which avoids the computationally intensive process of dividing the image into multiple sub-images for detection and significantly improves detection efficiency. The foreground extraction network is built on the YOLOv5 model with two modifications: the large-object detection head and the classifier are removed, and PConv convolution is introduced to reconstruct the C3 module, which accelerates convolution and makes foreground extraction more efficient. In addition, a Res2Rep convolutional structure is developed to enlarge the receptive field and improve the accuracy of foreground extraction. Finally, a foreground image construction method is proposed that fuses and stitches the foreground target regions into a single unified foreground image. Detection then runs on this one foreground image rather than on multiple divided sub-images, which reduces time overhead. The effectiveness of MegaDetectNet for ultra-high-resolution images is validated on the publicly available DOTA dataset. Experimental results show that, across various commonly used object detectors, MegaDetectNet reduces detection time by an average of 83.8% compared with the sub-image division method, at the cost of only an 8.7% decrease in mAP (mean Average Precision). These results support the practicality and efficacy of MegaDetectNet for object detection in ultra-high-resolution images.
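
The abstract does not spell out how foreground regions are fused and stitched, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes the foreground extraction network has already produced axis-aligned boxes in pixel coordinates, merges overlapping boxes into their union ("fusing"), and places the resulting regions on one canvas with simple shelf packing ("stitching"). The function names `fuse_boxes`, `stitch_regions`, and `to_source_coords`, the box format, and the choice of shelf packing are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical box format: (x1, y1, x2, y2) in pixels, x2/y2 exclusive.
Box = Tuple[int, int, int, int]
Placement = Tuple[Box, Tuple[int, int]]  # (source box, canvas top-left offset)


def overlaps(a: Box, b: Box) -> bool:
    """Return True if the two boxes share any area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def fuse_boxes(boxes: List[Box]) -> List[Box]:
    """Merge overlapping boxes into union boxes until none overlap."""
    merged = list(boxes)
    changed = True
    while changed:
        changed = False
        out: List[Box] = []
        for box in merged:
            for i, other in enumerate(out):
                if overlaps(box, other):
                    # Replace the kept box with the union of the pair.
                    out[i] = (min(box[0], other[0]), min(box[1], other[1]),
                              max(box[2], other[2]), max(box[3], other[3]))
                    changed = True
                    break
            else:
                out.append(box)
        merged = out
    return merged


def stitch_regions(boxes: List[Box],
                   canvas_width: int) -> Tuple[List[Placement], int]:
    """Lay regions out row by row (shelf packing), tallest first.

    Returns the placements and the resulting canvas height.
    """
    placements: List[Placement] = []
    x = y = shelf_h = 0
    for box in sorted(boxes, key=lambda b: b[3] - b[1], reverse=True):
        w, h = box[2] - box[0], box[3] - box[1]
        if w > canvas_width:
            raise ValueError("region is wider than the canvas")
        if x + w > canvas_width:  # row is full: start a new shelf below
            y += shelf_h
            x = shelf_h = 0
        placements.append((box, (x, y)))
        x += w
        shelf_h = max(shelf_h, h)
    return placements, y + shelf_h


def to_source_coords(det: Box, placements: List[Placement]) -> Box:
    """Map a detection on the canvas back to original-image coordinates."""
    cx, cy = (det[0] + det[2]) / 2, (det[1] + det[3]) / 2
    for (sx1, sy1, sx2, sy2), (ox, oy) in placements:
        if ox <= cx < ox + (sx2 - sx1) and oy <= cy < oy + (sy2 - sy1):
            dx, dy = sx1 - ox, sy1 - oy
            return (det[0] + dx, det[1] + dy, det[2] + dx, det[3] + dy)
    raise ValueError("detection centre lies outside every stitched region")


# Example with made-up boxes: two overlapping regions and one distant region.
regions = fuse_boxes([(0, 0, 100, 50), (80, 20, 160, 90),
                      (4000, 3000, 4100, 3060)])
placements, height = stitch_regions(regions, canvas_width=200)
```

In this sketch a detector would run once on the 200-pixel-wide canvas instead of on many sub-images, and `to_source_coords` would translate each detection back to the original image. A practical version would likely also pad each region and copy the actual pixels, which is omitted here.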

Funder

National Innovation Special Zone Project

Publisher

MDPI AG

Subject

Electrical and Electronic Engineering, Computer Networks and Communications, Hardware and Architecture, Signal Processing, Control and Systems Engineering

Link

https://www.mdpi.com/2079-9292/12/18/3737/pdf
