Bit-Weight Adjustment for Bridging Uniform and Non-Uniform Quantization to Build Efficient Image Classifiers
-
Published:2023-12-18
Issue:24
Volume:12
Page:5043
-
ISSN:2079-9292
-
Container-title:Electronics
-
language:en
-
Short-container-title:Electronics
Author:
Zhou Xichuan1, Duan Yunmo1, Ding Rui1, Wang Qianchuan2, Wang Qi3, Qin Jian1, Liu Haijun1
Affiliation:
1. School of Microelectronics and Communication Engineering, Chongqing University, Chongqing 400044, China 2. Alibaba (China) Co., Ltd., Hangzhou 311100, China 3. Dingding (China) Information Technology Co., Ltd., Hangzhou 311100, China
Abstract
Network quantization, which strives to reduce the precision of model parameters and/or features, is one of the most efficient ways to accelerate model inference and reduce memory consumption, particularly for deep models when performing a variety of real-time vision tasks on edge platforms with constrained resources. Existing quantization approaches function well when using relatively high bit widths but suffer from a decline in accuracy at ultra-low precision. In this paper, we propose a bit-weight adjustment (BWA) module to bridge uniform and non-uniform quantization, successfully quantizing the model to ultra-low bit widths without bringing about noticeable performance degradation. Given uniformly quantized data, the BWA module adaptively transforms these data into non-uniformly quantized data by simply introducing trainable scaling factors. With the BWA module, we combine uniform and non-uniform quantization in a single network, allowing low-precision networks to benefit from both the hardware friendliness of uniform quantization and the high performance of non-uniform quantization. We optimize the proposed BWA module by directly minimizing the classification loss through end-to-end training. Numerous experiments on the ImageNet and CIFAR-10 datasets reveal that the proposed approach outperforms state-of-the-art approaches across various bit-width settings and can even produce low-precision quantized models that are competitive with their full-precision counterparts.
Funder
National Natural Science Foundation of China CCF-Alibaba Innovative Research Fund For Young Scholars Fundamental Research Funds for the Central Universities
Subject
Electrical and Electronic Engineering,Computer Networks and Communications,Hardware and Architecture,Signal Processing,Control and Systems Engineering
Reference41 articles.
1. He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27–30). Deep Residual Learning for Image Recognition. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA. 2. He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 11–14). Identity mappings in deep residual networks. Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands. 3. Condés, I., Fernández-Conde, J., Perdices, E., and Cañas, J.M. (2023). Robust Person Identification and Following in a Mobile Robot Based on Deep Learning and Optical Tracking. Electronics, 12. 4. Zhang, R., Zhu, Z., Li, L., Bai, Y., and Shi, J. (2023). BFE-Net: Object Detection with Bidirectional Feature Enhancement. Electronics, 12. 5. Liang, C., Yang, J., Du, R., Hu, W., and Tie, Y. (2023). Non-Uniform Motion Aggregation with Graph Convolutional Networks for Skeleton-Based Human Action Recognition. Electronics, 12.
Cited by
1 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献
|
|