Application Level Resource Scheduling for Deep Learning Acceleration on MPSoC
-
Published:2023-07-18
Issue:
Volume:
Page:
-
ISSN:1939-8018
-
Container-title:Journal of Signal Processing Systems
-
language:en
-
Short-container-title:J Sign Process Syst
Author:
Gao Cong, Saha Sangeet, Zhu Xuqi, Jing Hongyuan, McDonald-Maier Klaus D., Zhai Xiaojun
Abstract
Deep Neural Networks (DNNs) have been widely used in many applications, such as self-driving cars, natural language processing (NLP), image classification, and visual object recognition. Field-programmable gate array (FPGA) based Multiprocessor System-on-Chip (MPSoC) platforms have recently become a popular choice for deploying DNN models. However, the limited resource capacity of an MPSoC poses a challenge for such practical implementations. Recent studies have revealed a trade-off between the resources consumed and the performance achieved. Motivated by these findings, this paper addresses the problem of efficiently implementing deep learning on a resource-constrained MPSoC, where each deep learning network can run at different service levels depending on resource usage (a higher service level implies higher performance at increased resource consumption). To this end, we propose a heuristic-based strategy, Application-Wise Level Selector (AWLS), which selects service levels to maximize the overall performance subject to a given resource bound. AWLS achieves higher performance within a constrained resource budget under various simulation scenarios. We further verify the proposed strategy on an AMD-Xilinx Zynq UltraScale+ XCZU9EG SoC. Using a framework designed to deploy multiple DNNs on multiple DPUs (Deep Learning Processing Units), we show that the algorithm reaches an optimal solution, obtaining the highest performance (frames per second) within the same resource budget.
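The service-level selection problem described in the abstract can be viewed as a multiple-choice knapsack variant: each DNN application offers several (resource cost, performance) operating points, and exactly one level must be chosen per application without exceeding the shared resource budget. The sketch below is a minimal illustration of that formulation, not the authors' AWLS algorithm; the application names, per-level (resource, FPS) tables, budget value, and the greedy FPS-per-resource upgrade rule are all assumptions made for the example.

```python
# Hypothetical sketch of application-wise service-level selection under a
# shared resource budget (illustrative only; not the paper's AWLS algorithm).
# Each application lists its service levels as (resource_cost, fps) pairs,
# ordered from lowest to highest level. All numbers are made up.

apps = {
    "resnet50":  [(10, 25), (20, 45), (35, 60)],
    "yolov3":    [(15, 12), (30, 22), (50, 30)],
    "mobilenet": [(5,  40), (12, 70), (20, 90)],
}
BUDGET = 80  # assumed total resource units available on the MPSoC


def greedy_level_selection(apps, budget):
    # Start every application at its lowest service level.
    levels = {name: 0 for name in apps}
    used = sum(table[0][0] for table in apps.values())
    if used > budget:
        raise ValueError("Even the lowest levels exceed the resource budget")

    while True:
        best = None  # (fps gain per extra resource unit, app name, extra resources)
        for name, table in apps.items():
            lvl = levels[name]
            if lvl + 1 >= len(table):
                continue  # already at the highest level
            extra_res = table[lvl + 1][0] - table[lvl][0]
            extra_fps = table[lvl + 1][1] - table[lvl][1]
            if extra_res <= 0 or used + extra_res > budget:
                continue  # upgrade not affordable within the budget
            ratio = extra_fps / extra_res
            if best is None or ratio > best[0]:
                best = (ratio, name, extra_res)
        if best is None:
            break  # no affordable upgrade remains
        _, name, extra_res = best
        levels[name] += 1
        used += extra_res

    total_fps = sum(apps[n][levels[n]][1] for n in apps)
    return levels, used, total_fps


if __name__ == "__main__":
    levels, used, fps = greedy_level_selection(apps, BUDGET)
    print(f"Selected levels: {levels}, resources used: {used}, total FPS: {fps}")
```

A greedy upgrade rule of this kind is only a heuristic and is not, in general, guaranteed to reach the exhaustive optimum that the paper reports for AWLS on the XCZU9EG board; it merely illustrates the level-versus-budget trade-off the abstract describes.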
Funder
Engineering and Physical Sciences Research Council
Publisher
Springer Science and Business Media LLC
Subject
Hardware and Architecture, Modeling and Simulation, Information Systems, Signal Processing, Theoretical Computer Science, Control and Systems Engineering