FPGA Implementation of a Deep Learning Acceleration Core Architecture for Image Target Detection
-
Published:2023-03-24
Issue:7
Volume:13
Page:4144
-
ISSN:2076-3417
-
Container-title:Applied Sciences
-
language:en
-
Short-container-title:Applied Sciences
Author:
Yang Xu1ORCID, Zhuang Chen2ORCID, Feng Wenquan1, Yang Zhe1, Wang Qiang1
Affiliation:
1. School of Electronic & Information Engineering, Beihang University, Beijing 100080, China 2. Hefei Innovation Research Institute of Beihang University, Hefei 230012, China
Abstract
Due to the flexibility and ease of deployment of Field Programmable Gate Arrays (FPGA), more and more studies have been conducted on developing and optimizing target detection algorithms based on Convolutional Neural Networks (CNN) models using FPGAs. Still, these studies focus on improving the performance of the core algorithm and optimizing hardware structure, with few studies focusing on the unified architecture design and corresponding optimization techniques for the algorithm model, resulting in inefficient overall model performance. The essential reason is that these studies do not address arithmetic power, speed, and resource consistency. In order to solve this problem, we propose a deep learning acceleration core architecture based on FPGAs, which is designed for target detection algorithms with CNN models, using multi-channel parallelization of CNN network models to improve the arithmetic power, using scheduling tasks and intensive computation pipelining to meet the algorithm’s data bandwidth requirements and unifying the speed and area of the orchestrated computation matrix to save hardware resources. The proposed framework achieves 14 Frames Per Second (FPS) inference performance of the TinyYolo model at 5 Giga Operations Per Second (GOPS) with 30% higher running clock frequency, 2–4 times higher arithmetic power, and 28% higher Digital Signal Processing (DSP) resource utilization efficiency using less than 25% of FPGA resource usage.
Funder
National Natural Science Foundation of China
Subject
Fluid Flow and Transfer Processes,Computer Science Applications,Process Chemistry and Technology,General Engineering,Instrumentation,General Materials Science
Reference35 articles.
1. Redmon, J., Divvala, S., Girshick, R., and Farhadi, A. (2016, January 27–30). You only look once: Unified, real-time object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA. 2. Faster r-cnn: Towards real-time object detection with region proposal networks;Ren;Adv. Neural Inf. Process. Syst.,2015 3. Sun, B., Wang, X., Oad, A., Pervez, A., and Dong, F. (2023). Automatic Ship Object Detection Model Based on YOLOv4 with Transformer Mechanism in Remote Sensing Images. Appl. Sci., 13. 4. Sun, Z., Leng, X., Lei, Y., Xiong, B., Ji, K., and Kuang, G. (2021). BiFA-YOLO: A novel YOLO-based method for arbitrary-oriented ship detection in high-resolution SAR images. Remote Sens., 13. 5. Hu, J., Zhi, X., Shi, T., Zhang, W., Cui, Y., and Zhao, S. (2021). PAG-YOLO: A portable attention-guided YOLO network for small ship detection. Remote Sens., 13.
Cited by
4 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献
|
|