FPGA Hardware Acceleration Research and Implementation of Deep Learning Algorithms

Published: 2023-11-05 Issue: 3 Volume: 5 Page: 146-149
ISSN: 2832-6024
Container-title: Frontiers in Computing and Intelligent Systems
Short-container-title: FCIS

Author:

Hu Yuxuan

Abstract

Convolutional neural networks are a core deep learning model, and YOLOv3-tiny, which is built on them, has excellent object detection ability. However, the model still demands considerable computing power, which makes it difficult to deploy in embedded applications. This paper proposes a hardware acceleration method for YOLOv3-tiny and implements it on an FPGA platform. First, the network is quantized to fixed point, with the fixed-point strategy chosen using data accuracy as the criterion. Second, the design applies parallel computing and pipeline optimization, and introduces a FIFO structure to shorten the running time. Finally, experiments are carried out on the Xilinx PYNQ-Z2 platform, and the results are compared with previous related work.
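The abstract does not spell out the fixed-point strategy, so the following is only a minimal sketch of one common way to choose a format with accuracy as the criterion: try every split of a signed word into integer and fractional bits and keep the split with the lowest mean absolute error. The function names `quantize` and `best_frac_bits`, the 8-bit word length, the error metric, and the sample weights are all assumptions made for illustration, not details taken from the paper.

```python
def quantize(values, total_bits, frac_bits):
    """Round floats to signed fixed point with saturation, then convert back to float."""
    scale = 1 << frac_bits
    qmin = -(1 << (total_bits - 1))
    qmax = (1 << (total_bits - 1)) - 1
    result = []
    for v in values:
        q = round(v * scale)            # nearest representable integer code
        q = max(qmin, min(qmax, q))     # saturate instead of wrapping around
        result.append(q / scale)
    return result


def best_frac_bits(values, total_bits=8):
    """Return (frac_bits, mean_abs_error) for the split with the lowest error."""
    best = None
    for frac_bits in range(total_bits):
        approx = quantize(values, total_bits, frac_bits)
        err = sum(abs(a - b) for a, b in zip(values, approx)) / len(values)
        if best is None or err < best[1]:
            best = (frac_bits, err)
    return best


# Hypothetical weights, used only to exercise the search.
weights = [0.5, -1.25, 3.75, 0.1]
frac_bits, error = best_frac_bits(weights, total_bits=8)
print(frac_bits, error)
```

Too few fractional bits lose precision on small values, while too many leave no room for the integer part and force large values to saturate, so the search settles on the split that balances the two for the given data.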

Publisher

Darcy & Roy Press Co. Ltd.

References (11 articles)

1. Zhang, C., Li, P., Sun, G.Y., Guan, Y.J., Xiao, B.J., Cong, J. (2015) Optimizing FPGA-based Accelerator Design for Deep Convolutional Neural Networks. International Symposium on Field-Programmable Gate Arrays (FPGA), 161-170.

2. Sun, F., Wang, C., Gong, L., Xu, C., Zhou, X. (2017) A High-Performance Accelerator for Large-Scale Convolutional Neural Networks. 2017 IEEE International Symposium on Parallel and Distributed Processing with Applications and 2017 IEEE International Conference on Ubiquitous Computing and Communications (ISPA/IUCC), 1-9.

3. Venieris, S.I., Bouganis, C.S. (2016) FPGAConvNet: A Framework for Mapping Convolutional Neural Networks on FPGAs. IEEE International Symposium on Field-Programmable Custom Computing Machines, London, UK, 40-47.

4. Redmon, J., Divvala, S., Girshick, R., Farhadi, A. (2016) You Only Look Once: Unified, Real-Time Object Detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 779-788.

5. Ren, S., He, K., Girshick, R., Sun, J. (2017) Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Transactions on Pattern Analysis & Machine Intelligence, 39(6): 1137-1149.