Resource-demand Estimation for Edge Tensor Processing Units

Published:2022-09-30 Issue:5 Volume:21 Page:1-24
ISSN:1539-9087
Container-title:ACM Transactions on Embedded Computing Systems
language:en
Short-container-title:ACM Trans. Embed. Comput. Syst.

Author:

Herzog Benedict¹, Reif Stefan², Hemp Judith², Hönig Timo¹, Schröder-Preikschat Wolfgang²

Affiliation:

1. Ruhr-Universität Bochum (RUB), Germany

2. Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), Germany

Abstract

Machine learning has shown tremendous success in a large variety of applications. The evolution of machine-learning applications from cloud-based systems to mobile and embedded devices has shifted the focus from purely quality-related aspects towards the resource demand of machine learning. For embedded systems, dedicated accelerator hardware promises the energy-efficient execution of neural network inferences. The precise resource demand of these accelerators in terms of execution time and power demand, however, is undocumented. Developers, therefore, face the challenge of fine-tuning their neural networks such that their resource demand matches the available budgets. This article presents Precious, a comprehensive approach to estimate the resource demand of an embedded neural network accelerator. We generate randomised neural networks, analyse them statically, execute them on an embedded accelerator while measuring their actual power draw and execution time, and train estimators that map the statically analysed neural network properties to the measured resource demand. In addition, this article provides an in-depth analysis of the neural networks’ resource demands and the responsible network properties. We demonstrate that the estimation error of Precious can be below 1.5% for both power draw and execution time. Furthermore, we discuss what estimator accuracy is practically achievable and how much effort is required to achieve sufficient accuracy.

Publisher

Association for Computing Machinery (ACM)

Subject

Hardware and Architecture, Software

Link

https://dl.acm.org/doi/pdf/10.1145/3520132

References (62 articles)

1. Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, Manjunath Kudlur, Josh Levenberg, Rajat Monga, Sherry Moore, Derek G. Murray, Benoit Steiner, Paul Tucker, Vijay Vasudevan, Pete Warden, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. 2016. TensorFlow: A system for large-scale machine learning. In Proceedings of the 12th Symposium on Operating Systems Design and Implementation (OSDI’16). USENIX, 265–283.

2. Low-Power Computer Vision: Status, Challenges, and Opportunities

3. Structured Pruning of Deep Convolutional Neural Networks

4. Random search for hyper-parameter optimization; Bergstra James; J. Mach. Learn. Res., 2012

5. Bfloat16 Processing for Neural Networks

Cited by 2 articles.

1. A Hybrid Framework Leveraging Whale Optimization and Deep Learning With Trust-Index for Attack Identification in IoT Networks; IEEE Access; 2024

2. Bears: Building Energy-Aware Reconfigurable Systems; 2022 XII Brazilian Symposium on Computing Systems Engineering (SBESC); 2022-11-21