Enhancing Real-time Target Detection in Smart Cities: YOLOv8-DSAF Insights

Published: 2024-01-29

Author:

Li Yihong¹, Huang Yanrong², Tao Qi³

Affiliation:

1. Zhaoqing University

2. Zhejiang University of Water Resources and Electric Power

3. South China Normal University

Abstract

With the global rise of smart city construction, target detection technology plays a crucial role in optimizing urban functions and improving quality of life. However, existing target detection methods still fall short in accuracy, real-time performance, and adaptability. To address these shortcomings, this study proposes a target detection model built on the YOLOv8-DSAF structure. The model comprises three key modules: Depthwise Separable Convolution (DSConv), the Dual-Path Attention Gate module (DPAG), and the Feature Enhancement Module (FEM). First, DSConv reduces computational complexity, enabling real-time target detection on limited hardware resources. Second, the DPAG module introduces a dual-channel attention mechanism that lets the model focus selectively on crucial areas, improving detection accuracy in highly dynamic traffic scenarios. Finally, the FEM highlights crucial features to prevent their loss, further enhancing detection accuracy. Experimental results on the KITTI V and Cityscapes datasets indicate that our model outperforms the YOLOv8 model, suggesting higher detection accuracy and adaptability in complex urban traffic scenarios. We believe that this model can contribute to the development of smart cities and advance target detection technology.
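
The abstract credits DSConv with lowering computational cost. As a rough illustration of why a depthwise separable convolution is cheaper than a standard one, the sketch below compares parameter and multiply-accumulate (MAC) counts for a single layer. It is not the authors' implementation: the function names and layer sizes are hypothetical, bias terms are omitted, and the formulas are the standard textbook counts rather than figures from the paper.

```python
def standard_conv_cost(c_in, c_out, k, h_out, w_out):
    """Return (parameters, MACs) for a standard k x k convolution, bias omitted."""
    # Every output channel has its own k x k filter over all input channels.
    params = k * k * c_in * c_out
    # Each weight is applied once per output position.
    macs = params * h_out * w_out
    return params, macs


def depthwise_separable_cost(c_in, c_out, k, h_out, w_out):
    """Return (parameters, MACs) for a depthwise + pointwise pair, bias omitted."""
    # Depthwise step: one k x k filter per input channel, no channel mixing.
    depthwise_params = k * k * c_in
    # Pointwise step: a 1 x 1 convolution that mixes channels.
    pointwise_params = c_in * c_out
    params = depthwise_params + pointwise_params
    macs = params * h_out * w_out
    return params, macs


if __name__ == "__main__":
    # Hypothetical layer sizes, chosen for illustration only.
    c_in, c_out, k, h_out, w_out = 64, 128, 3, 80, 80

    std_params, std_macs = standard_conv_cost(c_in, c_out, k, h_out, w_out)
    ds_params, ds_macs = depthwise_separable_cost(c_in, c_out, k, h_out, w_out)

    print(f"standard conv:       {std_params} params, {std_macs} MACs")
    print(f"depthwise separable: {ds_params} params, {ds_macs} MACs")
    # The ratio equals 1/c_out + 1/k**2 for any layer of this shape.
    print(f"cost ratio:          {ds_params / std_params:.4f}")
```

For these assumed sizes the separable layer needs roughly 12% of the parameters and MACs of the standard layer, which is the kind of saving that makes real-time inference on constrained hardware more feasible. How YOLOv8-DSAF places such layers within the network is not stated in the abstract.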

Publisher

Springer Science and Business Media LLC

References: 38 articles.

1. Vehicle Detection and Classification via YOLOv8 and Deep Belief Network over Aerial Image Sequences; Al Mudawi N; Sustainability, 2023

2. A CNN accelerator on FPGA using depthwise separable convolution; Bai L; IEEE Transactions on Circuits and Systems II: Express Briefs, 2018

3. Behley, J., Garbade, M., Milioto, A., Quenzel, J., Behnke, S., Stachniss, C., & Gall, J. (2019). SemanticKITTI: A dataset for semantic scene understanding of LiDAR sequences. Proceedings of the IEEE/CVF International Conference on Computer Vision.

4. Bui, P. H. D., Nguyen, T. T., Nguyen, T. M., & Nguyen, H. T. (2023). An Approach for Traffic Sign Recognition with Versions of YOLO. International Conference on Intelligent Systems and Data Science.

5. Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., & Zagoruyko, S. (2020). End-to-end object detection with transformers. European Conference on Computer Vision.