AE-Qdrop: Towards Accurate and Efficient Low-Bit Post-Training Quantization for a Convolutional Neural Network
Published: 2024-02-04
Issue: 3
Volume: 13
Page: 644
ISSN: 2079-9292
Container-title: Electronics
Language: en
Short-container-title: Electronics
Author:
Li Jixing 1,2,3, Chen Gang 1,2,3, Jin Min 1,2,3, Mao Wenyu 1,2,3, Lu Huaxiang 1,2,3
Affiliation:
1. Institute of Semiconductors, Chinese Academy of Sciences, Beijing 100083, China
2. University of Chinese Academy of Sciences, Beijing 100089, China
3. Beijing Key Lab of Semiconductor Neural Network Intelligent Perception and Computing Technology, Beijing 100083, China
Abstract
Block-wise reconstruction with adaptive rounding helps achieve acceptable 4-bit post-training quantization accuracy. However, adaptive rounding is time-intensive, and the optimization space of weight elements is constrained to a binary set, which limits the performance of quantized models. Moreover, the optimality of block-wise reconstruction requires that subsequent network blocks remain unquantized. To address these issues, we propose AE-Qdrop, a two-stage post-training quantization scheme comprising block-wise reconstruction and global fine-tuning. In the block-wise reconstruction stage, a progressive optimization strategy replaces adaptive rounding, improving both quantization accuracy and efficiency; in addition, randomly weighted quantized activation mitigates the risk of overfitting. In the global fine-tuning stage, the weights of all quantized network blocks are corrected simultaneously through logit matching and feature matching. Experiments on image classification and object detection tasks confirm that AE-Qdrop achieves accurate and efficient quantization. For 2-bit MobileNetV2, AE-Qdrop outperforms Qdrop in quantization accuracy by 6.26%, and its quantization efficiency is fivefold higher.
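To make the abstract's global fine-tuning stage concrete, the sketch below shows what a combined logit-matching and feature-matching objective could look like in PyTorch, with the full-precision network supervising the quantized one. This is a minimal illustration under stated assumptions, not the paper's implementation: the KL-based logit loss, the MSE feature loss, and the names global_finetune_loss, alpha, and tau are all hypothetical choices for demonstration.

```python
# Minimal sketch (assumption, not the paper's code): a combined
# logit-matching + feature-matching loss for global fine-tuning,
# where the full-precision network supervises the quantized one.
import torch
import torch.nn.functional as F

def global_finetune_loss(q_logits, fp_logits, q_feats, fp_feats,
                         alpha=1.0, tau=1.0):
    # Logit matching: KL divergence between temperature-softened
    # class distributions of the quantized and full-precision nets.
    logit_loss = F.kl_div(
        F.log_softmax(q_logits / tau, dim=1),
        F.softmax(fp_logits / tau, dim=1),
        reduction="batchmean",
    ) * tau ** 2
    # Feature matching: MSE between corresponding block outputs.
    feat_loss = sum(F.mse_loss(q, f) for q, f in zip(q_feats, fp_feats))
    return logit_loss + alpha * feat_loss

# Toy usage with random tensors standing in for network outputs.
q_logits, fp_logits = torch.randn(8, 1000), torch.randn(8, 1000)
q_feats = [torch.randn(8, 64, 14, 14)]
fp_feats = [torch.randn(8, 64, 14, 14)]
print(global_finetune_loss(q_logits, fp_logits, q_feats, fp_feats))
```

Optimizing all quantized blocks jointly against a loss of this form is what distinguishes the global fine-tuning stage from the earlier per-block reconstruction.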
Funder
National Natural Science Foundation of China; CAS Strategic Leading Science and Technology Project
Subject
Electrical and Electronic Engineering, Computer Networks and Communications, Hardware and Architecture, Signal Processing, Control and Systems Engineering