Keep in Balance: Runtime-reconfigurable Intermittent Deep Inference

Author:

Yen Chih-Hsuan (1), Mendis Hashan Roshantha (2), Kuo Tei-Wei (3), Hsiu Pi-Cheng (4)

Affiliation:

1. National Taiwan University and Academia Sinica, Taiwan

2. Academia Sinica, Taiwan

3. National Taiwan University, Taiwan and Mohamed bin Zayed University of Artificial Intelligence, United Arab Emirates

4. Academia Sinica, National Taiwan University and National Chi Nan University, Taiwan

Abstract

Intermittent deep neural network (DNN) inference is a promising technique to enable intelligent applications on tiny devices powered by ambient energy sources. Nonetheless, intermittent execution presents inherent challenges, primarily involving accumulating progress across power cycles and having to refetch volatile data lost due to power loss in each power cycle. Existing approaches typically optimize the inference configuration to maximize data reuse. However, we observe that such a fixed configuration may be significantly inefficient due to the fluctuating balance point between data reuse and data refetch caused by the dynamic nature of ambient energy. This work proposes DynBal, an approach to dynamically reconfigure the inference engine at runtime. DynBal is realized as a middleware plugin that improves inference performance by exploring the interplay between data reuse and data refetch to maintain their balance with respect to the changing level of intermittency. An indirect metric is developed to easily evaluate an inference configuration considering the variability in intermittency, and a lightweight reconfiguration algorithm is employed to efficiently optimize the configuration at runtime. We evaluate the improvement brought by integrating DynBal into a recent intermittent inference approach that uses a fixed configuration. Evaluations were conducted on a Texas Instruments device with various network models and under varied intermittent power strengths. Our experimental results demonstrate that DynBal can speed up intermittent inference by 3.26 times, achieving a greater improvement for a large network under high intermittency and a large gap between memory and computation performance.

Funder

Ministry of Science and Technology, Taiwan

Publisher

Association for Computing Machinery (ACM)

Subject

Hardware and Architecture, Software

