SRE-YOLOv8: An Improved UAV Object Detection Model Utilizing Swin Transformer and RE-FPN

Author:

Li Jun (1,2), Zhang Jiajie (1,2), Shao Yanhua (3), Liu Feng (4,5)

Affiliation:

1. Artificial Intelligence Security Innovation Research, Beijing Information Science and Technology University, Beijing 100192, China

2. Department of Information Security, Beijing Information Science and Technology University, Beijing 100192, China

3. National Computer System Engineering Research Institute of China, Beijing 100083, China

4. School of Computer Science and Technology, East China Normal University, Shanghai 200062, China

5. Shanghai International School of Chief Technology Officer, East China Normal University, Shanghai 200062, China

Abstract

Images taken by unmanned aerial vehicles (UAVs) are hard to detect objects in accurately, because the objects vary widely in size and type and carry limited feature information. To address this, we present SRE-YOLOv8, which improves the YOLOv8 object detection algorithm with a Swin Transformer and a lightweight residual feature pyramid network (RE-FPN). First, we introduce an optimized Swin Transformer module into the backbone network so that feature extraction preserves global contextual information and self-attention captures a broader range of features. Second, we integrate a Residual Feature Augmentation (RFA) module and a lightweight attention mechanism, ECA, transforming the original FPN structure into RE-FPN and strengthening the network's emphasis on critical features. Third, we add a small object detection (SOD) layer to strengthen the network's ability to capture spatial information, which improves accuracy on small objects. Finally, we use a Dynamic Head with multiple attention mechanisms as the detection head to improve the identification of low-resolution targets against complex backgrounds. Experiments on the VisDrone2021 dataset show a 9.2% improvement over the original YOLOv8 algorithm.
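To make the ECA step in the abstract concrete, the following is a minimal pure-Python sketch of efficient channel attention as formulated in the ECA-Net paper: global average pooling per channel, a shared 1D convolution across neighbouring channels, a sigmoid gate, and channel-wise rescaling. It is an illustration only, not the authors' implementation. The function names `eca_kernel_size` and `eca_attention`, the nested-list tensor layout, and the defaults `gamma=2`, `b=1` are assumptions taken from the general ECA formulation; the abstract does not state which settings SRE-YOLOv8 uses or where in RE-FPN the module is placed.

```python
import math


def eca_kernel_size(channels, gamma=2, b=1):
    """Adaptive 1D kernel size from the ECA-Net formulation:
    k = |log2(C)/gamma + b/gamma| rounded to the nearest odd value.
    gamma and b are assumed defaults, not values from SRE-YOLOv8."""
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 else t + 1


def eca_attention(feature_map, weights):
    """Apply ECA-style channel attention.

    feature_map: list of C channels, each an H x W nested list of floats.
    weights:     1D convolution kernel of odd length k, shared by all channels
                 (learned in a real network, passed in here for illustration).
    Returns (rescaled feature map, per-channel gates).
    """
    k = len(weights)
    pad = k // 2

    # 1) Global average pooling: one scalar descriptor per channel.
    pooled = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in feature_map
    ]

    # 2) 1D convolution along the channel axis with zero padding, so each
    #    channel's gate depends only on its k nearest neighbours.
    padded = [0.0] * pad + pooled + [0.0] * pad
    mixed = [
        sum(weights[j] * padded[i + j] for j in range(k))
        for i in range(len(pooled))
    ]

    # 3) Sigmoid turns the mixed descriptors into gates in (0, 1).
    gates = [1.0 / (1.0 + math.exp(-m)) for m in mixed]

    # 4) Rescale every spatial position of each channel by its gate.
    scaled = [
        [[v * g for v in row] for row in ch]
        for ch, g in zip(feature_map, gates)
    ]
    return scaled, gates


if __name__ == "__main__":
    # Toy input: 3 channels of size 2 x 2 with constant values 1, 2, 3.
    fmap = [[[float(c)] * 2 for _ in range(2)] for c in (1, 2, 3)]
    _, gates = eca_attention(fmap, [0.0, 1.0, 0.0])
    print(eca_kernel_size(256))  # 5
    print(gates)
```

Because the kernel is shared and spans only k channels, the module adds k parameters per layer, which is why ECA is usually described as lightweight compared with attention blocks that use fully connected layers.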

Funder

Basic Research Project of the National Defense Science and Industry Bureau

Publisher

MDPI AG

