Keep in Balance: Runtime-reconfigurable Intermittent Deep Inference

Published: 2023-09-09 Issue: 5s Volume: 22 Page: 1-25
ISSN: 1539-9087
Container-title: ACM Transactions on Embedded Computing Systems
Language: en
Short-container-title: ACM Trans. Embed. Comput. Syst.

Author:

Yen Chih-Hsuan¹, Mendis Hashan Roshantha², Kuo Tei-Wei³, Hsiu Pi-Cheng⁴

Affiliation:

1. National Taiwan University and Academia Sinica, Taiwan

2. Academia Sinica, Taiwan

3. National Taiwan University, Taiwan and Mohamed bin Zayed University of Artificial Intelligence, United Arab Emirates

4. Academia Sinica, National Taiwan University and National Chi Nan University, Taiwan

Abstract

Intermittent deep neural network (DNN) inference is a promising technique to enable intelligent applications on tiny devices powered by ambient energy sources. Nonetheless, intermittent execution presents inherent challenges, primarily involving accumulating progress across power cycles and having to refetch volatile data lost due to power loss in each power cycle. Existing approaches typically optimize the inference configuration to maximize data reuse. However, we observe that such a fixed configuration may be significantly inefficient due to the fluctuating balance point between data reuse and data refetch caused by the dynamic nature of ambient energy. This work proposes DynBal, an approach to dynamically reconfigure the inference engine at runtime. DynBal is realized as a middleware plugin that improves inference performance by exploring the interplay between data reuse and data refetch to maintain their balance with respect to the changing level of intermittency. An indirect metric is developed to easily evaluate an inference configuration considering the variability in intermittency, and a lightweight reconfiguration algorithm is employed to efficiently optimize the configuration at runtime. We evaluate the improvement brought by integrating DynBal into a recent intermittent inference approach that uses a fixed configuration. Evaluations were conducted on a Texas Instruments device with various network models and under varied intermittent power strengths. Our experimental results demonstrate that DynBal can speed up intermittent inference by 3.26 times, achieving a greater improvement for a large network under high intermittency and a large gap between memory and computation performance.
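
The trade-off between data reuse and data refetch described in the abstract can be illustrated with a toy cost model. The sketch below is not DynBal's indirect metric or its reconfiguration algorithm, neither of which is given on this page. Every name and constant in it (`estimated_cost`, `best_tile`, `compute`, `traffic`, `fetch_cost`, the candidate tile sizes) is a hypothetical assumption chosen only to show why the preferred configuration can shift as the energy available per power cycle changes.

```python
def estimated_cost(tile, energy_per_cycle, compute=10_000.0,
                   traffic=50_000.0, fetch_cost=1.0):
    """Toy cost of one inference under intermittent power (all units arbitrary).

    Assumptions, not taken from the paper:
    - a larger tile in volatile memory gives more reuse, so fetch work
      under continuous power shrinks as traffic / tile;
    - every power failure loses the tile, so each power cycle first
      spends tile * fetch_cost of its energy budget reloading it.
    """
    reuse_fetch = traffic / tile * fetch_cost   # fetch work with stable power
    refetch = tile * fetch_cost                 # reload overhead per power cycle
    if refetch >= energy_per_cycle:
        return float("inf")                     # a cycle cannot even reload the tile
    useful_fraction = 1.0 - refetch / energy_per_cycle
    return (compute + reuse_fetch) / useful_fraction


def best_tile(energy_per_cycle, candidates=(8, 16, 32, 64, 128, 256)):
    """Pick the candidate tile size with the lowest toy cost."""
    return min(candidates, key=lambda t: estimated_cost(t, energy_per_cycle))


if __name__ == "__main__":
    for energy in (100_000.0, 300.0):           # weaker power means shorter cycles
        print(energy, best_tile(energy))
```

With these made-up constants, the largest tile wins when each power cycle is long, and a smaller tile wins when power cycles are short, because the per-cycle reload cost starts to dominate. A fixed configuration tuned for one case would be a poor fit for the other, which is the motivation the abstract gives for reconfiguring at runtime.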

Funder

Ministry of Science and Technology, Taiwan

Publisher

Association for Computing Machinery (ACM)

Subject

Hardware and Architecture, Software

Link

https://dl.acm.org/doi/pdf/10.1145/3607918

Cited by 1 article.

1. LACT: Liveness-Aware Checkpointing to reduce checkpoint overheads in intermittent systems;Journal of Systems Architecture;2024-08