A Heterogeneous Inference Framework for a Deep Neural Network
Published: 2024-01-14
Volume: 13
Issue: 2
Page: 348
ISSN: 2079-9292
Container-title: Electronics
Short-container-title: Electronics
Language: en
Author:
Rafael Gadea-Gironés ¹, José Luís Rocabado-Rocha ¹, Jorge Fe ¹, Jose M. Monzo ¹
Affiliation:
1. Institute for Molecular Imaging Technologies (I3M), Universitat Politècnica de València, 46022 Valencia, Spain
Abstract
Artificial intelligence (AI) is one of the most promising technologies based on machine learning algorithms. In this paper, we propose a workflow for the implementation of deep neural networks. This workflow attempts to combine the flexibility of high-level synthesis (HLS)-based flows with the architectural control offered by hardware description language (HDL)-based flows. The architecture consists of a convolutional neural network, SqueezeNet v1.1, and a hard processor system (HPS) that coexists with the acceleration hardware to be designed. This methodology allows us to compare solutions based solely on software (PyTorch 1.13.1) and to propose heterogeneous inference solutions, taking advantage of the best options within the software and hardware flows. The proposed workflow is implemented on a low-cost field programmable gate array system-on-chip (FPGA SoC) platform, specifically the DE10-Nano development board. We provide systolic architectural solutions written in OpenCL that are highly flexible and easily tunable to take full advantage of the resources of programmable devices and to achieve superior energy efficiency while working with 32-bit floating point. From a verification point of view, the proposed method is effective, since reference models for all tests, both for the individual layers and for the complete network, were readily available from packages widely used for the development, training, and inference of deep networks.
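To illustrate the verification approach described above (reference models obtained from widely used deep learning packages), the following minimal sketch shows how 32-bit floating-point golden outputs for SqueezeNet v1.1 could be generated with PyTorch/torchvision, both per layer and end to end, for comparison against a heterogeneous FPGA SoC implementation. This is not the authors' code; the hook naming, the random test input, and the use of pretrained weights (torchvision ≥ 0.13 weights API) are assumptions made for illustration.

```python
# Sketch: generate float32 reference activations for SqueezeNet v1.1
# to serve as a golden model for layer-by-layer hardware verification.
import torch
import torchvision.models as models

# Pretrained SqueezeNet v1.1 in float32, inference mode
# (weights download requires network access; weights=None also works
# for pure numerical comparison against the same exported weights).
model = models.squeezenet1_1(weights=models.SqueezeNet1_1_Weights.DEFAULT)
model.eval()

# Hypothetical test input: one 224x224 RGB tensor (already normalized)
x = torch.randn(1, 3, 224, 224, dtype=torch.float32)

# Capture per-layer outputs with forward hooks so individual layers,
# not just the final classification, can be checked against the accelerator.
reference = {}

def save_output(name):
    def hook(module, inputs, output):
        reference[name] = output.detach().clone()
    return hook

for name, module in model.features.named_children():
    module.register_forward_hook(save_output(f"features.{name}"))

with torch.no_grad():
    logits = model(x)

print("top-1 class:", logits.argmax(dim=1).item())
for name, tensor in reference.items():
    print(name, tuple(tensor.shape))
```

The captured tensors can then be dumped to files and compared element-wise (e.g., with a small absolute/relative tolerance) against the outputs produced by the OpenCL kernels running on the DE10-Nano.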
Funder
Ministry of Science, Innovation and Universities (MCIU) of Spain