Mobile or FPGA? A Comprehensive Evaluation on Energy Efficiency and a Unified Optimization Framework

Published: 2022-09-30
Volume: 21
Issue: 5
Pages: 1-22
ISSN: 1539-9087
Container-title: ACM Transactions on Embedded Computing Systems
Language: en
Short-container-title: ACM Trans. Embed. Comput. Syst.

Author:

Yuan Geng¹, Dong Peiyan¹, Sun Mengshu¹, Niu Wei², Li Zhengang¹, Cai Yuxuan¹, Li Yanyu¹, Liu Jun³, Jiang Weiwen⁴, Lin Xue¹, Ren Bin², Tang Xulong⁵, Wang Yanzhi¹

Affiliation:

1. Northeastern University, Boston, Massachusetts, USA

2. College of William and Mary, USA

3. Carnegie Mellon University, USA

4. University of Notre Dame, USA

5. University of Pittsburgh, USA

Abstract

Efficient deployment of Deep Neural Networks (DNNs) on edge devices (i.e., FPGAs and mobile platforms) is very challenging, especially given the recent growth in DNN model size and complexity. Model compression strategies, including weight quantization and pruning, are widely recognized as effective ways to significantly reduce computation and memory intensity, and have been implemented in many DNNs on edge devices. However, most state-of-the-art works focus on ad hoc optimizations, and a thorough study that comprehensively reveals the potential and constraints of different edge devices under different compression strategies is still lacking. In this article, we qualitatively and quantitatively compare the energy efficiency of FPGA-based and mobile-based DNN execution, with the mobile side running on mobile GPUs, and provide a detailed analysis. Based on the observations from this analysis, we propose a unified optimization framework that uses block-based pruning to reduce weight storage and accelerate inference on mobile devices and FPGAs, achieving high hardware performance and energy-efficiency gains while maintaining accuracy.

Publisher

Association for Computing Machinery (ACM)

Subject

Hardware and Architecture, Software

Link

https://dl.acm.org/doi/pdf/10.1145/3528578

References: 85 articles.
