Authors
Li Yuhang, Dong Xin, Zhang Sai Qian, Bai Haoli, Chen Yuanpeng, Wang Wei
Abstract
To deploy deep neural networks on resource-limited devices, quantization has been widely explored. In this work, we study extremely low-bit networks, which offer substantial speed-up and memory savings through quantized activations and weights. We first raise three overlooked issues in extremely low-bit networks: the squashed range of quantized values, vanishing gradients during backpropagation, and the unexploited hardware acceleration of ternary networks. By reparameterizing the quantized activation and weight vectors as a fixed ternary vector with a full-precision scale and offset, we decouple range and magnitude from direction to alleviate these problems. The learnable scale and offset automatically adjust the range of quantized values and the sparsity without gradient vanishing. A novel encoding and computation pattern is designed to support efficient computing for our reparameterized ternary network (RTN). Experiments on ResNet-18 for ImageNet demonstrate that the proposed RTN achieves a much better trade-off between bitwidth and accuracy, with up to 26.76% relative accuracy improvement compared with state-of-the-art methods. Moreover, we validate the proposed computation pattern on Field Programmable Gate Arrays (FPGA), where it brings 46.46× and 89.17× savings in power and area, respectively, compared with full-precision convolution.
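The reparameterization described in the abstract can be illustrated with a minimal sketch. It shows only the general idea of a fixed ternary code combined with a full-precision scale and offset. The thresholding rule, the function names (`ternarize`, `reparameterized_ternary`), and the numeric values are assumptions made for illustration and are not taken from the paper, which may normalize and quantize differently.

```python
import numpy as np

def ternarize(x, threshold=0.5):
    """Map each entry to {-1, 0, +1} using a fixed threshold (assumed rule)."""
    return np.sign(x) * (np.abs(x) > threshold)

def reparameterized_ternary(x, scale, offset, threshold=0.5):
    """Return scale * t + offset together with the ternary code t of x.

    The ternary code t carries the direction; scale and offset
    (learnable in training, plain floats here) set magnitude and range.
    """
    t = ternarize(x, threshold)
    return scale * t + offset, t

# Hypothetical input values, chosen only to exercise all three levels.
x = np.array([-1.2, -0.3, 0.1, 0.8])
q, t = reparameterized_ternary(x, scale=0.7, offset=0.1)
```

In this sketch every quantized value takes one of three levels (`offset - scale`, `offset`, `offset + scale`), so changing `scale` and `offset` moves the range without changing the ternary code itself.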
Publisher
Association for the Advancement of Artificial Intelligence (AAAI)
Cited by
11 articles.
1. Robust Binary Encoding for Ternary Neural Networks Toward Deployment on Emerging Memory;2024 International Joint Conference on Neural Networks (IJCNN);2024-06-30
2. Quantization of Neural Networks;Computational Intelligence Methods and Applications;2024
3. BISDU: A Bit-Serial Dot-Product Unit for Microcontrollers;ACM Transactions on Embedded Computing Systems;2023-09-26
4. iMAT: Energy-Efficient In-Memory Acceleration for Ternary Neural Networks With Sparse Dot Product;2023 IEEE/ACM International Symposium on Low Power Electronics and Design (ISLPED);2023-08-07
5. Advanced Binary Neural Network for Single Image Super Resolution;International Journal of Computer Vision;2023-04-18