Affiliations:
1. School of Computer Science and Engineering, Central South University, Changsha 410083, China
2. EIAS Data Science and Blockchain Laboratory, College of Computer and Information Sciences, Prince Sultan University, Riyadh 11586, Saudi Arabia
3. School of Computer Science and Technology, Guangdong University of Technology, Guangzhou 510006, China
Abstract
In smart cities, effective traffic congestion management depends on accurate pedestrian and vehicle detection. Unmanned Aerial Vehicles (UAVs) offer mobility, cost-effectiveness, and a wide field of view, yet recognition models must still be optimized to handle small and occluded objects. To address these issues, we combine the YOLOv8s model with a Swin Transformer block and introduce the PVswin-YOLOv8s model for UAV-based pedestrian and vehicle detection. First, the backbone network of YOLOv8s incorporates the Swin Transformer for global feature extraction, which benefits small object detection. Second, to reduce missed detections, we integrate the Convolutional Block Attention Module (CBAM) into the neck of YOLOv8s. Both its channel and spatial attention modules are used because they effectively extract feature information as it flows through the network. Finally, we employ Soft-NMS to improve the accuracy of pedestrian and vehicle detection under occlusion, since it handles overlapping bounding boxes well. The proposed network reduces the fraction of small objects missed and improves detection performance. To validate the PVswin-YOLOv8s model, we compared it with other YOLO versions (YOLOv3-tiny, YOLOv5, YOLOv6, and YOLOv7), YOLOv8 variants (YOLOv8n, YOLOv8s, YOLOv8m, and YOLOv8l), and classical object detectors (Faster R-CNN, Cascade R-CNN, RetinaNet, and CenterNet). The experimental results confirm its effectiveness, showing a 4.8% increase in mean average precision (mAP) over YOLOv8s on the VisDrone2019 dataset.
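To illustrate how Soft-NMS treats overlapping bounding boxes, the following is a minimal sketch in plain Python. It is not the authors' implementation. The abstract does not state which Soft-NMS variant or hyperparameters were used, so the Gaussian decay, the `sigma` value, the `score_threshold` value, and the function names `iou` and `soft_nms` are all assumptions made for illustration. Instead of discarding every box that overlaps a higher-scoring one, the sketch lowers each overlapping box's score in proportion to the overlap, so a partially occluded pedestrian or vehicle can still survive.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS (illustrative; sigma and threshold are assumed values).

    Returns a list of (box_index, rescored_confidence) in selection order.
    """
    scores = list(scores)  # copy so the caller's list is not modified
    candidates = list(range(len(boxes)))
    kept = []
    while candidates:
        # Select the remaining box with the highest current score.
        best = max(candidates, key=lambda i: scores[i])
        candidates.remove(best)
        kept.append((best, scores[best]))
        survivors = []
        for i in candidates:
            # Decay the score smoothly with overlap instead of deleting the box.
            overlap = iou(boxes[best], boxes[i])
            scores[i] *= math.exp(-(overlap ** 2) / sigma)
            if scores[i] >= score_threshold:
                survivors.append(i)
        candidates = survivors
    return kept


# Two heavily overlapping boxes and one separate box (made-up coordinates).
example_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
example_scores = [0.9, 0.8, 0.7]
for index, score in soft_nms(example_boxes, example_scores):
    print(index, round(score, 3))
```

In this toy input, the second box overlaps the first heavily, so its score is reduced rather than zeroed. Standard NMS with a typical IoU threshold of 0.5 would have removed it outright.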