RNA: A Flexible and Efficient Accelerator Based on Dynamically Reconfigurable Computing for Multiple Convolutional Neural Networks

Published:2022-07-07 Issue:16 Volume:31
ISSN:0218-1266
Container-title:Journal of Circuits, Systems and Computers
language:en
Short-container-title:J CIRCUIT SYST COMP

Author:

Yang Chen¹, Hou Jia¹, Wang Yizhou¹, Zhang Haibo¹, Wang Xiaoli¹, Geng Li¹

Affiliation:

1. School of Microelectronics, Xi’an Jiaotong University, No. 28, Xianning West Road, Beilin District, Xi’an, Shaanxi 710049, P. R. China

Abstract

Increasingly complicated and versatile convolutional neural network (CNN) models bring challenges to hardware acceleration in terms of performance, energy efficiency and flexibility. This paper proposes a reconfigurable neural accelerator (RNA) for flexible and efficient CNN acceleration. To provide hardware flexibility, RNA employs a dynamically reconfigurable computing framework that rapidly configures the data path between processing elements (PEs) at run-time, as well as an interlaced data access mechanism for multi-bank RAM. To achieve high energy efficiency, three optimization mechanisms, namely image row broadcasting dataflow (IRBD), tile-by-tile computing (TTC), and zero detection technology (ZDT), are designed specifically for RNA to exploit data reuse and decrease the memory bandwidth requirement, which is the key to improving performance and reducing power consumption. To save hardware overhead, an online dynamic adaptive data truncation (DADT) mechanism is designed to compensate for the accuracy loss of multiplication results, so that the computational precision in RNA can be reduced from 16-bit to 8-bit, which reduces the area of the data path. The RNA architecture is implemented on a Xilinx XC7Z100 FPGA and runs at 250 MHz. Experimental results show that the performance when running LeNet, AlexNet and VGG is 500 GOPS, 598 GOPS and 660 GOPS, respectively. Compared to previous FPGA-based designs, RNA achieves [Formula: see text] performance speedup and [Formula: see text] improvement in energy efficiency.
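
The abstract names zero detection technology (ZDT) but does not describe how it is realized in hardware. As a rough illustration of the general idea of zero detection only, and not the authors' design, the Python sketch below shows a multiply-accumulate loop that skips any product with a zero operand and counts how many multiplications are avoided. The function name `mac_with_zero_skip` and the sample operands are hypothetical.

```python
def mac_with_zero_skip(activations, weights):
    """Multiply-accumulate that skips products with a zero operand.

    Hypothetical software sketch of generic zero skipping; it is not
    taken from the RNA paper.
    Returns (accumulated sum, multiplications performed, multiplications skipped).
    """
    if len(activations) != len(weights):
        raise ValueError("operand lists must have equal length")
    acc = 0
    performed = 0
    skipped = 0
    for a, w in zip(activations, weights):
        if a == 0 or w == 0:
            # The product is zero, so the multiplier can stay idle.
            skipped += 1
            continue
        acc += a * w
        performed += 1
    return acc, performed, skipped


if __name__ == "__main__":
    # Sample data with many zeros, as is common after a ReLU activation.
    acts = [0, 3, 0, 5, 0, 0, 2, 0]
    wts = [4, 1, 7, 0, 2, 9, 6, 3]
    print(mac_with_zero_skip(acts, wts))
```

In this sample, only two of the eight multiplications are actually carried out, which suggests why skipping zeros can save energy when activations are sparse.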

Funder

National Natural Science Foundation of China

Publisher

World Scientific Pub Co Pte Ltd

Subject

Electrical and Electronic Engineering, Hardware and Architecture

Link

https://www.worldscientific.com/doi/pdf/10.1142/S0218126622502899

References: 58 articles.

1. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation

2. Feature Extraction of Colorectal Endoscopic Images for Computer-Aided Diagnosis with CNN

3. Novel Deep Learning Model with CNN and Bi-Directional LSTM for Improved Stock Market Index Prediction

4. Wasserstein CNN: Learning Invariant Features for NIR-VIS Face Recognition

Cited by 2 articles.

1. A Novel Two-Level Protection Scheme against Hardware Trojans on a Reconfigurable CNN Accelerator;Cryptography;2024-08-04

2. Hardware Trojan Attacks on the Reconfigurable Interconnections of Field-Programmable Gate Array-Based Convolutional Neural Network Accelerators and a Physically Unclonable Function-Based Countermeasure Detection Technique;Micromachines;2024-01-19