A BNN Accelerator Based on Edge-skip-calculation Strategy and Consolidation Compressed Tree

Published:2022-05-10 Issue:3 Volume:15 Page:1-20
ISSN:1936-7406
Container-title:ACM Transactions on Reconfigurable Technology and Systems
language:en
Short-container-title:ACM Trans. Reconfigurable Technol. Syst.

Author:

Du Gaoming¹, Chen Bangyi¹, Li Zhenmin¹, Tu Zhenxing¹, Zhou Junjie², Wang Shenya¹, Zhao Qinghao¹, Yin Yongsheng¹, Wang Xiaolei¹

Affiliation:

1. Institute of VLSI Design, Hefei University of Technology, Hefei, China and IC Design Cooperative Research Center of Ministry of Education, Hefei, China

2. Division of Automated Driving, Chery Automobile Co., Ltd., Wuhu, China

Abstract

Binarized neural networks (BNNs) and batch normalization (BN) have become typical techniques in artificial intelligence. Unfortunately, the massive accumulation and multiplication in BNN models pose challenges for field-programmable gate array (FPGA) implementations, because the complex arithmetic in BN consumes too many computing resources. To relax FPGA resource limitations and speed up computation, we propose a BNN accelerator architecture based on a consolidation compressed tree scheme that merges the XNOR and low-bit accumulation operations into a single unified operation. During the compression process, we adopt 0-padding (not ±1) so that no accuracy is lost between software modeling and hardware implementation. Moreover, we introduce a shift-addition-BN free binarization technique to shorten the delay path and optimize on-chip storage. In summary, we drastically cut hardware consumption while maintaining high speed at the same model complexity as the previous design. We evaluate our accelerator on the MNIST and CIFAR-10 datasets and implement the whole system on an ARTIX-7 100T FPGA, achieving a speed of 2052.65 GOP/s and an area efficiency of 70.15 GOPS/KLUT.
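For readers unfamiliar with the two ideas the abstract builds on, the following Python sketch illustrates the textbook formulation only; it is not the paper's compressed tree or its shift-addition scheme. It assumes the common encoding of +1 as bit 1 and -1 as bit 0, and shows (a) that a ±1 dot product equals 2 × popcount(XNOR) − n, and (b) that BN followed by a sign function can be folded into a single threshold comparison. All function names are hypothetical and chosen for illustration.

```python
import math


def pack_bits(values):
    """Pack a list of +1/-1 values into an int (+1 -> bit 1, -1 -> bit 0)."""
    word = 0
    for i, v in enumerate(values):
        if v == 1:
            word |= 1 << i
    return word


def xnor_popcount_dot(a_bits, w_bits, n):
    """Dot product of two packed +/-1 vectors of length n via XNOR and popcount."""
    mask = (1 << n) - 1
    agree = ~(a_bits ^ w_bits) & mask   # bit is 1 where the two signs match
    matches = bin(agree).count("1")     # popcount
    return 2 * matches - n              # matches minus mismatches


def reference_dot(a, w):
    """Plain multiply-accumulate reference on +/-1 values."""
    return sum(x * y for x, y in zip(a, w))


def bn_sign(x, gamma, beta, mean, var, eps=1e-5):
    """Batch normalization followed by sign binarization (reference)."""
    y = gamma * (x - mean) / math.sqrt(var + eps) + beta
    return 1 if y >= 0 else -1


def bn_threshold(gamma, beta, mean, var, eps=1e-5):
    """Fold BN into a threshold tau; assumes gamma != 0.

    Returns (tau, positive_gamma). A negative gamma flips the comparison.
    """
    tau = mean - beta * math.sqrt(var + eps) / gamma
    return tau, gamma > 0


def threshold_sign(x, tau, positive_gamma):
    """Binarize with one comparison instead of the full BN arithmetic."""
    if positive_gamma:
        return 1 if x >= tau else -1
    return 1 if x <= tau else -1


if __name__ == "__main__":
    a = [1, -1, -1, 1, 1]
    w = [1, 1, -1, -1, 1]
    print(xnor_popcount_dot(pack_bits(a), pack_bits(w), len(a)), reference_dot(a, w))
    tau, pos = bn_threshold(gamma=0.5, beta=0.3, mean=2.0, var=4.0)
    print(threshold_sign(3, tau, pos), bn_sign(3, 0.5, 0.3, 2.0, 4.0))
```

Because the threshold can be computed offline, the hardware only needs to store one value per channel and perform one comparison, which is the general motivation for BN-free binarization; how the paper realizes this with shifts and additions is not shown here.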

Funder

National Key Research and Development Program

University Synergy Innovation Program of Anhui Province

Publisher

Association for Computing Machinery (ACM)

Subject

General Computer Science

Link

https://dl.acm.org/doi/pdf/10.1145/3494569

Reference: 31 articles.

1. YodaNN: An Ultra-Low Power Convolutional Neural Network Accelerator Based on Binary Weights

2. Torch7: A matlab-like environment for machine learning;Collobert Ronan;BigLearn, NIPS workshop,2011

3. Matthieu Courbariaux and Yoshua Bengio. 2016. BinaryNet: Training deep neural networks with weights and activations constrained to +1 or -1. Retrieved from https://arXiv:1602.02830.

4. Memory access optimized routing scheme for deep networks on a mobile coprocessor

5. Tree Structure Network: A Learning-Based Deep Network for Classification of CPU Instruction through EM Signal

Cited by 4 articles.

1. Enhancing Bnn Storage and Performance Requirements Via Efficient Quantization and Variable Encoding;2024

2. RESEARCH ON IDENTIFICATION OF CROP LEAF PESTS AND DISEASES BASED ON FEW-SHOT LEARNING;Engenharia Agrícola;2023-12

3. Parallel Symbiotic Random Number Generator for Training Tsetlin Machines on FPGA;2023 International Symposium on the Tsetlin Machine (ISTM);2023-08-29

4. Exploiting Kernel Compression on BNNs;2023 Design, Automation & Test in Europe Conference & Exhibition (DATE);2023-04