Any-Precision Deep Neural Networks

Published:2021-05-18 Issue:12 Volume:35 Page:10763-10771
ISSN:2374-3468
Container-title:Proceedings of the AAAI Conference on Artificial Intelligence
Short-container-title:AAAI

Author:

Yu Haichao, Li Haoxiang, Shi Humphrey, Huang Thomas S., Hua Gang

Abstract

We present any-precision deep neural networks (DNNs), which are trained with a new method that allows the learned DNNs to be flexible in numerical precision during inference. At runtime, the same model can be set directly to different bit-widths by truncating the least significant bits, which supports a dynamic trade-off between speed and accuracy. When all layers are set to low bit-widths, we show that the model achieves accuracy comparable to dedicated models trained at the same precision. This property facilitates flexible deployment of deep learning models in real-world applications, where trade-offs between model accuracy and runtime efficiency are often sought in practice. Previous literature presents solutions for training models at each individual fixed efficiency/accuracy trade-off point, but how to produce a model that is flexible in runtime precision is largely unexplored. When the demand for an efficiency/accuracy trade-off varies from time to time or even changes dynamically at runtime, it is infeasible to re-train models accordingly, and the storage budget may forbid keeping multiple models. Our proposed framework achieves this flexibility without performance degradation. More importantly, we demonstrate that this achievement is agnostic to model architectures and applicable to multiple vision tasks. Our code is released at https://github.com/SHI-Labs/Any-Precision-DNNs.
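The truncation step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (the released repository is the reference for that); it assumes, for illustration only, that weights are stored as unsigned 8-bit integer codes over a fixed range of [-1, 1]. The function names `truncate_to_bits` and `dequantize` are hypothetical.

```python
def truncate_to_bits(codes, stored_bits, target_bits):
    """Keep the `target_bits` most significant bits of each unsigned code.

    Dropping the least significant bits is a right shift, so a single
    stored model can be read out at any lower bit-width.
    """
    if not 1 <= target_bits <= stored_bits:
        raise ValueError("target_bits must be in [1, stored_bits]")
    shift = stored_bits - target_bits
    return [c >> shift for c in codes]


def dequantize(codes, bits, lo=-1.0, hi=1.0):
    """Map unsigned `bits`-bit codes to the centers of 2**bits equal bins.

    The range [lo, hi] is an assumption made for this sketch.
    """
    step = (hi - lo) / (1 << bits)
    return [lo + (c + 0.5) * step for c in codes]


# Hypothetical 8-bit weight codes for one layer.
stored_bits = 8
weights_8bit = [0, 37, 128, 200, 255]

# Read the same stored weights at several runtime precisions.
for bits in (8, 4, 2, 1):
    codes = truncate_to_bits(weights_8bit, stored_bits, bits)
    values = [round(v, 3) for v in dequantize(codes, bits)]
    print(f"{bits}-bit codes: {codes} -> values: {values}")
```

Only one set of 8-bit codes is stored here; each lower precision is derived from it on demand, which is the storage argument the abstract makes. The training method that keeps accuracy high at every bit-width is not shown in this sketch.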

Publisher

Association for the Advancement of Artificial Intelligence (AAAI)

Subject

General Medicine

Cited by 14 articles.

1. Low-Precision Mixed-Computation Models for Inference on Edge; IEEE Transactions on Very Large Scale Integration (VLSI) Systems; 2024-08

2. Computational Complexity Optimization of Neural Network-Based Equalizers in Digital Signal Processing: A Comprehensive Approach; Journal of Lightwave Technology; 2024-06-15

3. Adapting Neural Networks at Runtime: Current Trends in At-Runtime Optimizations for Deep Learning; ACM Computing Surveys; 2024-05-14

4. Towards Better Structured Pruning Saliency by Reorganizing Convolution; 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV); 2024-01-03

5. Attention Round for post-training quantization; Neurocomputing; 2024-01