DaCapo: An On-Device Learning Scheme for Memory-Constrained Embedded Systems

Author:

Khan Osama (1), Park Gwanjong (1), Seo Euiseong (1)

Affiliation:

1. Sungkyunkwan University, Republic of Korea

Abstract

Deep neural network (DNN) applications are increasingly deployed on microcontroller unit (MCU) embedded systems. However, the DNN models in such systems frequently lose accuracy because of the dataset shift problem. On-device learning resolves this problem by updating the model parameters on-site with real-world data, thus localizing the model to its surroundings. However, the backpropagation step of on-device learning requires the output of every layer computed during the forward pass to be stored in memory. This is usually infeasible on MCU devices, as they are equipped with only a few KBs of SRAM. Given their energy limitations and timeliness requirements, using flash memory to store the output of every layer is not practical either. Although a few studies have proposed ways to enable on-device learning under stringent memory conditions, they require modifying the target models or using non-conventional gradient computation strategies. This paper proposes DaCapo, a backpropagation scheme that enables on-device learning in memory-constrained embedded systems. DaCapo stores only the outputs of certain layers, known as checkpoints, in SRAM and discards the others. The discarded outputs are recomputed during backpropagation from the nearest preceding checkpoint. To minimize recomputation, DaCapo optimally plans which checkpoints to keep in SRAM at each phase of backpropagation and replaces the stored checkpoints as backpropagation progresses. We implemented the proposed scheme on an STM32F429ZI board and evaluated it with five representative DNN models. Our evaluation showed that DaCapo improved backpropagation time by up to 22% and reduced energy consumption by up to 28% compared with AIfES, a machine learning platform optimized for MCU devices. In addition, our proposed approach enabled the training of MobileNet, which the MCU device had previously been unable to train.
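
The checkpoint-and-recompute idea described in the abstract can be illustrated with a minimal sketch. It is not DaCapo's implementation: it assumes a toy chain of scalar layers (y = tanh(w * x)), places checkpoints at a fixed stride instead of planning them optimally, and does not replace checkpoints during backpropagation. All function names and parameters here are hypothetical.

```python
import math

def forward_layer(w, x):
    """Toy layer (assumption for illustration): y = tanh(w * x)."""
    return math.tanh(w * x)

def backward_layer(w, x, grad_out):
    """Given the layer input x and dL/dy, return (dL/dx, dL/dw)."""
    y = math.tanh(w * x)
    dz = grad_out * (1.0 - y * y)  # derivative of tanh
    return dz * w, dz * x

def backprop_full(weights, x0):
    """Baseline: keep the input of every layer in memory."""
    inputs = [x0]
    for w in weights:
        inputs.append(forward_layer(w, inputs[-1]))
    grad = 1.0  # loss is the final output, so dL/dy_last = 1
    grads_w = [0.0] * len(weights)
    for i in reversed(range(len(weights))):
        grad, grads_w[i] = backward_layer(weights[i], inputs[i], grad)
    return grads_w, len(inputs)

def backprop_checkpointed(weights, x0, stride):
    """Keep only every `stride`-th layer input; recompute the rest."""
    n = len(weights)
    checkpoints = {0: x0}  # layer index -> stored input of that layer
    x = x0
    for i, w in enumerate(weights):
        x = forward_layer(w, x)
        if (i + 1) % stride == 0 and (i + 1) < n:
            checkpoints[i + 1] = x
    grad = 1.0
    grads_w = [0.0] * n
    recomputed = 0
    peak = len(checkpoints)  # peak number of stored layer inputs
    end = n
    for start in sorted(checkpoints, reverse=True):
        # Recompute the discarded inputs of layers start+1 .. end-1
        # from the nearest preceding checkpoint.
        seg_inputs = [checkpoints[start]]
        for i in range(start, end - 1):
            seg_inputs.append(forward_layer(weights[i], seg_inputs[-1]))
            recomputed += 1
        peak = max(peak, len(checkpoints) + len(seg_inputs) - 1)
        for i in reversed(range(start, end)):
            grad, grads_w[i] = backward_layer(weights[i], seg_inputs[i - start], grad)
        del checkpoints[start]  # this checkpoint is no longer needed
        end = start
    return grads_w, peak, recomputed

# Usage: both variants should produce the same weight gradients.
weights = [0.5, -1.2, 0.8, 1.1, -0.7, 0.9, 0.3, -0.4]
grads_full, stored_full = backprop_full(weights, 0.6)
grads_ckpt, peak_ckpt, recomputed = backprop_checkpointed(weights, 0.6, stride=3)
```

In this sketch, the checkpointed variant holds fewer layer inputs at its peak than the baseline, at the cost of extra forward computations. That memory-for-recomputation trade-off is what the abstract says DaCapo's checkpoint planning aims to minimize.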

Funder

Institute of Information and Communications Technology Planning and Evaluation

Development of Core Technology for Autonomous Energy-driven Computing System SW in Power-Instable Environment

National Research Foundation of Korea

Publisher

Association for Computing Machinery (ACM)

Subject

Hardware and Architecture, Software

Reference: 40 articles.

1. 2022. AIfES: Artificial Intelligence for Embedded Systems. https://github.com/Fraunhofer-IMS/AIfES_for_Arduino

2. 2022. FOMO: Faster Objects More Objects. https://docs.edgeimpulse.com/docs/edge-impulse-studio/learning-blocks/object-detection/fomo-object-detection-for-constrained-devices

3. Banbury Colby. 2021. MLPerf tiny benchmark. Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks.

4. Banbury Colby. 2021. Micronets: Neural network architectures for deploying tinyml applications on commodity microcontrollers. Proceedings of Machine Learning and Systems.

5. Stable Electromyographic Sequence Prediction During Movement Transitions using Temporal Convolutional Networks
