Towards Real-Time DNN Inference on Mobile Platforms with Model Pruning and Compiler Optimization

Published: 2020-07
Container-title: Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence

Author:

Niu Wei¹, Zhao Pu², Zhan Zheng², Lin Xue², Wang Yanzhi², Ren Bin¹

Affiliation:

1. College of William and Mary

2. Northeastern University

Abstract

High-end mobile platforms are rapidly becoming primary computing devices for a wide range of Deep Neural Network (DNN) applications. However, the constrained computation and storage resources on these devices still pose significant challenges for real-time DNN inference. To address this problem, we propose a set of hardware-friendly structured model pruning and compiler optimization techniques to accelerate DNN execution on mobile devices. This demo shows that these optimizations enable real-time mobile execution of multiple DNN applications, including style transfer, DNN coloring, and super resolution.

Publisher

International Joint Conferences on Artificial Intelligence Organization

Cited by 3 articles.

1. Towards Highly Compressed CNN Models for Human Activity Recognition in Wearable Devices;2023 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA);2023-09-20

2. Distributed Artificial Intelligence Empowered by End-Edge-Cloud Computing: A Survey;IEEE Communications Surveys & Tutorials;2023

3. HiTDL: High-Throughput Deep Learning Inference at the Hybrid Mobile Edge;IEEE Transactions on Parallel and Distributed Systems;2022-12-01