An Approach of Binary Neural Network Energy-Efficient Implementation

Author:

Gao Jiabao, Liu Qingliang, Lai Jinmei

Abstract

Binarized neural networks (BNNs), which have 1-bit weights and activations, are well suited for FPGA accelerators because their dominant computations are bitwise arithmetic and their reduced memory requirements allow all the network parameters to be stored in internal memory. However, the energy efficiency of these accelerators is still restricted by the abundant redundancies in BNNs. This hinders their deployment in smart sensors and tiny devices, which have tight constraints on energy consumption. To overcome this problem, we propose an approach to BNN inference that gives the accelerators excellent energy efficiency by pruning the massive redundant operations while maintaining the original accuracy of the networks. First, inspired by the observation that the convolution processes of two related kernels contain many repeated computations, we build a formula that clarifies the reuse relationship between their convolutional outputs and remove the unnecessary operations. Furthermore, by generalizing this reuse relationship to one tile of kernels in one neuron, we adopt an inclusion pruning strategy to skip the superfluous evaluations of neurons whose real output values can be determined early. Finally, we evaluate our system on the Zynq 7000 XC7Z100 FPGA platform. Our design can prune 51 percent of the operations without any accuracy loss. Meanwhile, the energy efficiency of our system is as high as 6.55 × 10^5 Img/kJ, which is 118× better than the best accelerator based on an NVIDIA Tesla-V100 GPU and 3.6× higher than the state-of-the-art FPGA implementations for BNNs.
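
The abstract does not state the reuse formula itself, so the following Python sketch is only an illustration of the general idea, not the authors' method. It assumes the common encoding in which bit 1 stands for +1 and bit 0 for -1, so that a binary dot product reduces to XOR and popcount. Under that assumption, the output of a second kernel can be derived from the output of a related first kernel by correcting only the positions where the two kernels differ. The function names (`binary_dot`, `reuse_dot`) and the 9-bit example values are hypothetical.

```python
def popcount(x: int) -> int:
    """Number of set bits in a non-negative integer."""
    return bin(x).count("1")


def binary_dot(a: int, w: int, n: int) -> int:
    """Dot product of two n-element +/-1 vectors packed as bits (1 -> +1, 0 -> -1).

    Positions where the bits agree contribute +1, positions where they
    differ contribute -1, so the result is n - 2 * popcount(a XOR w).
    """
    mask = (1 << n) - 1
    return n - 2 * popcount((a ^ w) & mask)


def reuse_dot(dot_w1: int, a: int, w1: int, w2: int, n: int) -> int:
    """Derive dot(a, w2) from an already computed dot(a, w1).

    Only the positions where w1 and w2 differ are revisited: at those
    positions w2 is the negation of w1, so their contribution flips sign.
    """
    mask = (1 << n) - 1
    diff = (w1 ^ w2) & mask
    # Contribution of the differing positions to dot(a, w1).
    partial = popcount(diff) - 2 * popcount((a ^ w1) & diff)
    return dot_w1 - 2 * partial


# Hypothetical 3x3 window and two kernels that differ in a single weight.
a = 0b101101011
w1 = 0b111000111
w2 = 0b111000101
first = binary_dot(a, w1, 9)
second = reuse_dot(first, a, w1, w2, 9)
```

In this sketch the second output costs work proportional to the number of differing weights rather than to the full kernel size, which is the kind of saving the abstract attributes to reusing results between related kernels.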

Publisher

MDPI AG

Subject

Electrical and Electronic Engineering, Computer Networks and Communications, Hardware and Architecture, Signal Processing, Control and Systems Engineering

Cited by 6 articles.

1. Latent Weight-Based Pruning for Small Binary Neural Networks;Proceedings of the 28th Asia and South Pacific Design Automation Conference;2023-01-16

2. A Systematic Literature Review on Binary Neural Networks;IEEE Access;2023

3. A TinyML based Residual Binarized Neural Network for real-time Image Classification;2022 6th International Conference on Electronics, Communication and Aerospace Technology;2022-12-01

4. Towards High Performance and Accurate BNN Inference on FPGA with Structured Fine-Grained Pruning;Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design;2022-10-30

5. High-Performance and Robust Binarized Neural Network Accelerator Based on Modified Content-Addressable Memory;Electronics;2022-09-03
