Enable Deep Learning on Mobile Devices: Methods, Systems, and Applications

Published:2022-03-04 Issue:3 Volume:27 Page:1-50
ISSN:1084-4309
Container-title:ACM Transactions on Design Automation of Electronic Systems
language:en
Short-container-title:ACM Trans. Des. Autom. Electron. Syst.

Author:

Cai Han¹, Lin Ji¹, Lin Yujun¹, Liu Zhijian¹, Tang Haotian¹, Wang Hanrui¹, Zhu Ligeng¹, Han Song¹

Affiliation:

1. Massachusetts Institute of Technology, Cambridge, MA, USA

Abstract

Deep neural networks (DNNs) have achieved unprecedented success in the field of artificial intelligence (AI), including computer vision, natural language processing, and speech recognition. However, their superior performance comes at a considerable cost in computational complexity, which greatly hinders their deployment on many resource-constrained devices, such as mobile phones and Internet of Things (IoT) devices. Therefore, methods and techniques that can lift the efficiency bottleneck while preserving the high accuracy of DNNs are in great demand to enable numerous edge AI applications. This article provides an overview of efficient deep learning methods, systems, and applications. We start by introducing popular model compression methods, including pruning, factorization, and quantization, as well as compact model design. To reduce the large design cost of these manual solutions, we discuss the AutoML frameworks for each of them, such as neural architecture search (NAS) and automated pruning and quantization. We then cover efficient on-device training to enable user customization based on the local data on mobile devices. Apart from general acceleration techniques, we also showcase several task-specific accelerations for point cloud, video, and natural language processing that exploit their spatial sparsity and temporal/token redundancy. Finally, to support all these algorithmic advancements, we introduce efficient deep learning system design from both software and hardware perspectives.

Publisher

Association for Computing Machinery (ACM)

Subject

Electrical and Electronic Engineering, Computer Graphics and Computer-Aided Design, Computer Science Applications

Link

https://dl.acm.org/doi/pdf/10.1145/3486618

References: 366 articles.

Cited by 52 articles.

1. Rapid and portable quantification of HIV RNA via a smartphone-enabled digital CRISPR device and deep learning;Sensors and Actuators Reports;2024-12

2. Flexi-BOPI: Flexible granularity pipeline inference with Bayesian optimization for deep learning models on HMPSoC;Information Sciences;2024-09

3. Socially-Aware Tile-Based Point Cloud Multicast with Registration;ICC 2024 - IEEE International Conference on Communications;2024-06-09

4. Anomaly detection of defect using energy of point pattern features within random finite set framework;Engineering Applications of Artificial Intelligence;2024-04

5. A collective AI via lifelong learning and sharing at the edge;Nature Machine Intelligence;2024-03-22