A Comprehensive Survey on Model Quantization for Deep Neural Networks in Image Classification

Authors:

Babak Rokh¹, Ali Azarpeyvand², Alireza Khanteymoori³

Affiliation:

1. Department of Computer Engineering, University of Zanjan, Zanjan, Iran

2. Department of Electrical and Computer Engineering, University of Zanjan, Zanjan, Iran

3. Neurozentrum Department, Universitätsklinikum Freiburg, Freiburg, Germany

Abstract

Recent advancements in machine learning achieved by Deep Neural Networks (DNNs) have been significant. While demonstrating high accuracy, DNNs involve a huge number of parameters and computations, which leads to high memory usage and energy consumption. As a result, deploying DNNs on devices with constrained hardware resources poses significant challenges. To overcome this, various compression techniques have been widely employed to optimize DNN accelerators. A promising approach is quantization, in which full-precision values are stored at low bit-width precision. Quantization not only reduces memory requirements but also replaces high-cost operations with low-cost ones. DNN quantization offers flexibility and efficiency in hardware design, making it a widely adopted technique. Since quantization has been extensively utilized in previous works, there is a need for an integrated report that provides an understanding, analysis, and comparison of different quantization approaches. Consequently, we present a comprehensive survey of quantization concepts and methods, with a focus on image classification. We describe clustering-based quantization methods and explore the use of a scale factor parameter for approximating full-precision values. Moreover, we thoroughly review the training of quantized DNNs, including the use of the straight-through estimator and quantization regularization. We explain the replacement of floating-point operations with low-cost bitwise operations in a quantized DNN and the sensitivity of different layers to quantization. Furthermore, we highlight the evaluation metrics for quantization methods and important benchmarks in the image classification task. We also present the accuracy of state-of-the-art methods on CIFAR-10 and ImageNet.
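The scale-factor approximation mentioned in the abstract can be illustrated with a minimal sketch of symmetric uniform quantization: full-precision values are mapped to low bit-width integers times a scale factor. This is a generic illustration, not the survey's own notation; all function and variable names are ours.

```python
# Minimal sketch of scale-factor (symmetric uniform) quantization.
# Full-precision values are approximated as integer * scale.

def quantize(values, num_bits=8):
    """Quantize a list of floats to signed num_bits integers plus a scale."""
    qmax = 2 ** (num_bits - 1) - 1                 # e.g. 127 for 8 bits
    max_abs = max(abs(v) for v in values) or 1.0   # avoid a zero scale
    scale = max_abs / qmax                         # the scale factor
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Recover an approximation of the original full-precision values."""
    return [qi * scale for qi in q]

weights = [0.31, -0.74, 0.05, 0.98, -0.12]
q, scale = quantize(weights, num_bits=4)
approx = dequantize(q, scale)
# each approximated value lies within scale/2 of the original
```

With 4 bits, each weight is stored as an integer in [-7, 7]; the rounding error per element is bounded by half the scale factor, which is the basic accuracy/bit-width trade-off the survey analyzes.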
This article attempts to make the readers familiar with the basic and advanced concepts of quantization, introduce important works in DNN quantization, and highlight challenges for future research in this field.
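The abstract also notes that quantization lets low-cost bitwise operations replace floating-point ones. A hedged sketch for the extreme (binary) case: when weights and activations are constrained to {-1, +1} and packed as bits (1 encoding +1), a dot product reduces to XNOR followed by popcount. Names here are illustrative assumptions, not the survey's notation.

```python
# Sketch: dot product of two {-1, +1} vectors via XNOR + popcount,
# replacing floating-point multiply-accumulate with bitwise operations.

def binary_dot(a_bits, b_bits, n):
    """Dot product of two n-element {-1,+1} vectors packed as n-bit ints.

    Bit i encodes the sign of element i (1 -> +1, 0 -> -1).
    """
    mask = (1 << n) - 1
    xnor = ~(a_bits ^ b_bits) & mask        # 1 wherever the signs agree
    matches = bin(xnor).count("1")          # popcount
    return 2 * matches - n                  # matches minus mismatches

# Example: a = [+1, +1, -1, +1], b = [+1, -1, +1, +1] (LSB first)
result = binary_dot(0b1011, 0b1101, 4)
```

On hardware, the XNOR and popcount each operate on a whole machine word at once, which is why binarized networks can trade accuracy for large speed and energy savings.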

Publisher

Association for Computing Machinery (ACM)

Subject

Artificial Intelligence, Theoretical Computer Science

References: 191 articles.

