Affiliation:
1. E3DA Unit, Digital Society Center - Fondazione Bruno Kessler (FBK), Trento, Italy
Abstract
In the Internet of Things era, where we see many interconnected and heterogeneous mobile and fixed smart devices, distributing the intelligence from the cloud to the edge has become a necessity. Due to limited computational and communication capabilities, low memory and limited energy budget, bringing artificial intelligence algorithms to peripheral devices, such as end-nodes of a sensor network, is a challenging task and requires the design of innovative solutions. In this work, we present PhiNets, a new scalable backbone optimized for deep-learning-based image processing on resource-constrained platforms. PhiNets are based on inverted residual blocks specifically designed to decouple the computational cost, working memory, and parameter memory, thus exploiting all available resources for a given platform. With a YoloV2 detection head and Simple Online and Realtime Tracking (SORT), the proposed architecture achieves state-of-the-art results in (i) detection on the COCO and VOC2012 benchmarks, and (ii) tracking on the MOT15 benchmark. PhiNets obtain a reduction in parameter count of around 90% with respect to previous state-of-the-art models (EfficientNetv1, MobileNetv2) and achieve better performance with lower computational cost. Moreover, we demonstrate our approach on a prototype node based on an STM32H743 microcontroller (MCU) with 2 MB of internal Flash and 1 MB of RAM and achieve power requirements on the order of 10 mW. The code for PhiNets is publicly available on GitHub.
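The three resources the abstract names (computational cost, working memory, and parameter memory) can be illustrated with a back-of-the-envelope cost model. The sketch below is not the PhiNets block itself: it assumes a generic MobileNetV2-style inverted residual block (1x1 expansion, depthwise convolution, 1x1 projection) with stride 1, no biases or normalization parameters, and 8-bit weights and activations. The function and class names are hypothetical; only the 2 MB Flash and 1 MB RAM figures come from the abstract.

```python
from dataclasses import dataclass

# Memory budget of the STM32H743 node described in the abstract.
FLASH_BYTES = 2 * 1024 * 1024  # parameter memory
RAM_BYTES = 1 * 1024 * 1024    # working memory


@dataclass
class BlockCost:
    params: int            # number of weights (parameter memory)
    macs: int              # multiply-accumulates (computational cost)
    peak_activations: int  # activation values live at once (working memory)


def inverted_residual_cost(h, w, c_in, c_out, expansion, kernel=3):
    """Rough cost of one generic inverted residual block (assumed layout)."""
    c_mid = c_in * expansion
    expand = c_in * c_mid               # 1x1 expansion convolution
    depthwise = kernel * kernel * c_mid  # depthwise convolution
    project = c_mid * c_out             # 1x1 projection convolution
    params = expand + depthwise + project
    # With stride 1, every weight is applied once per spatial position.
    macs = h * w * params
    # Working memory estimate: the largest input + output pair of any layer,
    # ignoring the residual copy of the block input.
    peak = h * w * max(c_in + c_mid, 2 * c_mid, c_mid + c_out)
    return BlockCost(params, macs, peak)


def fits_budget(cost, bytes_per_value=1):
    """Check a cost estimate against the Flash and RAM budget (int8 assumed)."""
    return (cost.params * bytes_per_value <= FLASH_BYTES
            and cost.peak_activations * bytes_per_value <= RAM_BYTES)


if __name__ == "__main__":
    cost = inverted_residual_cost(h=32, w=32, c_in=16, c_out=16, expansion=6)
    print(cost, fits_budget(cost))
```

In this simplified model the expansion factor scales parameters, operations, and activations together, while the spatial resolution affects only operations and activations. That coupling is the kind of constraint a backbone with separately tunable resource knobs would aim to loosen.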
Publisher
Association for Computing Machinery (ACM)
Subject
Hardware and Architecture, Software
Cited by
14 articles.
1. ZIP-CNN: Design Space Exploration for CNN Implementation within a MCU;ACM Transactions on Embedded Computing Systems;2024-09-04
2. A power-aware vision-based virtual sensor for real-time edge computing;Journal of Real-Time Image Processing;2024-05-30
3. Predicting Time Complexity of TensorFlow Lite Models;2024 IEEE Open Conference of Electrical, Electronic and Information Sciences (eStream);2024-04-25
4. Edge artificial intelligence for big data: a systematic review;Neural Computing and Applications;2024-04-16
5. An empirical evaluation of tinyML architectures for Class-Incremental Continual Learning;2024 IEEE International Conference on Pervasive Computing and Communications Workshops and other Affiliated Events (PerCom Workshops);2024-03-11