Abstract
Real-time depth estimation is crucial to many vision-related tasks, including autonomous driving, 3D reconstruction, and SLAM. In recent years, many methods have been proposed to estimate depth maps from images using different modality setups such as monocular vision, binocular vision, or sensor fusion. However, complex methods are unsuitable for real-time deployment on edge devices due to latency constraints and limited computational capacity. For edge implementation, models should be simple, small in size, and hardware-friendly. Considering these factors, we implemented MiDaSNet, which operates on the simplest setup, monocular vision, and uses a hardware-friendly CNN-based architecture, for real-time depth estimation on the edge. Moreover, because the model is trained on diverse datasets, it performs consistently across different domains. For edge deployment, we quantized the model weights to an 8-bit fixed-point representation. We then deployed the quantized model on an inexpensive FPGA board, the Kria KV260, using the predefined deep-learning processing unit embedded in the programmable logic. The results show that our quantized model achieves 82.6% zero-shot accuracy on the NYUv2 dataset at an inference speed of 50.7 fps on the board.