A Uniform Architecture Design for Accelerating 2D and 3D CNNs on FPGAs

Published:2019-01-07 Issue:1 Volume:8 Page:65
ISSN:2079-9292
Container-title:Electronics
language:en
Short-container-title:Electronics

Author:

Liu Zhiqiang,Chow Paul,Xu Jinwei,Jiang Jingfei,Dou Yong,Zhou Jie

Abstract

Three-dimensional convolutional neural networks (3D CNNs) have gained popularity in many complicated computer vision applications. Many customized FPGA-based accelerators have been proposed for 2D CNNs, but very few for 3D CNNs. 3D CNNs are far more computationally intensive, and the additional dimension further expands the design space, making it a big challenge to accelerate 3D CNNs on FPGAs. Motivated by the finding that the computation patterns of 2D and 3D CNNs are very similar, we propose a uniform architecture design for accelerating both 2D and 3D CNNs in this paper. The uniform architecture is based on the idea of mapping convolutions to matrix multiplications. A customized mapping module is developed to generate the feature matrix tilings with no need to store the entire enlarged feature matrix on-chip or off-chip; a splitting strategy is adopted to reconstruct a convolutional layer to adapt to the on-chip memory capacity; and a 2D multiply-and-accumulate (MAC) array is adopted to compute matrix multiplications efficiently. For demonstration, we implement an accelerator prototype with a high-level synthesis (HLS) methodology on a Xilinx VC709 board and test the accelerator on three typical CNN models: AlexNet, VGG16, and C3D. Experimental results show that the accelerator achieves state-of-the-art throughput performance on both 2D and 3D CNNs, with much better energy efficiency than the CPU and GPU.
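
The idea of mapping convolutions to matrix multiplications can be illustrated with a small software sketch. The Python below is not the paper's mapping module or MAC array. It is a plain im2col-style illustration, assuming stride 1 and no padding, and the function names (`im2col`, `conv_as_matmul`, `conv_direct`) are hypothetical. A 2D layer is treated as a 3D layer with depth 1, so one code path covers both cases.

```python
from itertools import product


def im2col(feat, kernel):
    """Build the enlarged feature matrix for a [C][D][H][W] input.

    Assumes stride 1 and no padding. Each row corresponds to one
    (channel, dz, dy, dx) kernel offset, each column to one output position.
    """
    C, D, H, W = len(feat), len(feat[0]), len(feat[0][0]), len(feat[0][0][0])
    kd, kh, kw = kernel
    od, oh, ow = D - kd + 1, H - kh + 1, W - kw + 1
    rows = []
    for c, dz, dy, dx in product(range(C), range(kd), range(kh), range(kw)):
        rows.append([feat[c][z + dz][y + dy][x + dx]
                     for z, y, x in product(range(od), range(oh), range(ow))])
    return rows, (od, oh, ow)


def matmul(a, b):
    """Naive matrix product of two nested lists."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*b)]
            for row in a]


def kernel_shape(weights):
    """Return (kd, kh, kw) for weights laid out as [M][C][kd][kh][kw]."""
    w = weights[0][0]
    return len(w), len(w[0]), len(w[0][0])


def conv_as_matmul(feat, weights):
    """Convolution as (weight matrix) x (feature matrix)."""
    kernel = kernel_shape(weights)
    cols, out_shape = im2col(feat, kernel)
    # Flatten each filter in the same (c, dz, dy, dx) order used by im2col.
    wmat = [[w[c][dz][dy][dx]
             for c, dz, dy, dx in product(range(len(w)), *map(range, kernel))]
            for w in weights]
    return matmul(wmat, cols), out_shape


def conv_direct(feat, weights):
    """Reference sliding-window convolution, used only to check the mapping."""
    kernel = kernel_shape(weights)
    C, D, H, W = len(feat), len(feat[0]), len(feat[0][0]), len(feat[0][0][0])
    od, oh, ow = D - kernel[0] + 1, H - kernel[1] + 1, W - kernel[2] + 1
    return [[sum(w[c][dz][dy][dx] * feat[c][z + dz][y + dy][x + dx]
                 for c, dz, dy, dx in product(range(C), *map(range, kernel)))
             for z, y, x in product(range(od), range(oh), range(ow))]
            for w in weights]


# 2D case: one channel, depth 1, 3x3 image, one 2x2 filter of ones.
image = [[[[1, 2, 3], [4, 5, 6], [7, 8, 9]]]]
ones_2d = [[[[[1, 1], [1, 1]]]]]
print(conv_as_matmul(image, ones_2d))   # ([[12, 16, 24, 28]], (1, 2, 2))

# 3D case: one channel, two 2x2 frames, one 2x2x2 filter of ones.
clip = [[[[1, 2], [3, 4]], [[5, 6], [7, 8]]]]
ones_3d = [[[[[1, 1], [1, 1]], [[1, 1], [1, 1]]]]]
print(conv_as_matmul(clip, ones_3d))    # ([[36]], (1, 1, 1))
```

Note that this sketch materializes the whole enlarged feature matrix in memory, which is exactly what the abstract says the customized mapping module avoids by generating feature matrix tilings on the fly.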

Funder

National Natural Science Foundation of China

Publisher

MDPI AG

Subject

Electrical and Electronic Engineering,Computer Networks and Communications,Hardware and Architecture,Signal Processing,Control and Systems Engineering

Link

https://www.mdpi.com/2079-9292/8/1/65/pdf

References: 23 articles.

Cited by 48 articles.

1. Review of neural network model acceleration techniques based on FPGA platforms;Neurocomputing;2024-12

2. ConvLSNet: A lightweight architecture based on ConvLSTM model for the classification of pulmonary conditions using multichannel lung sound recordings;Artificial Intelligence in Medicine;2024-08

3. FPGA-based SoC design for real-time facial point detection using deep convolutional neural networks with dynamic partial reconfiguration;Signal, Image and Video Processing;2024-05-14

4. A Novel FPGA Accelerator of R(2+1)D;2024 IEEE 32nd Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM);2024-05-05

5. Exploring Memory Access Techniques for Efficient FPGA based 3D CNN Accelerator Design;2024 IEEE 6th International Conference on AI Circuits and Systems (AICAS);2024-04-22