Affiliation:
1. Department of Computer, Control and Management Engineering, Sapienza University of Rome, Via Ariosto 25, 00185 Rome, Italy
Abstract
Knowledge of environmental depth is essential in many robotics and computer vision tasks, in both terrestrial and underwater scenarios. The hardware on which this technology runs, generally IoT and embedded devices, is limited in power consumption, so models with a low energy footprint are required. Recent works enable depth perception from single RGB images using deep architectures, such as convolutional neural networks and vision transformers, which are generally unsuitable for real-time inference on low-power embedded hardware. Moreover, such architectures are trained to estimate depth maps mainly on terrestrial scenarios because underwater depth data are scarce. To this end, we present two lightweight architectures based on optimized MobileNetV3 encoders and a specifically designed decoder to achieve fast inference and accurate estimation on embedded devices, a feasibility study on predicting depth maps in underwater scenarios, and an energy assessment that measures the actual energy consumption during inference. Specifically, we propose the MobileNetV3S75 configuration for inference on the 32-bit ARM CPU and the MobileNetV3LMin configuration for the 8-bit Edge TPU hardware. In underwater settings, the proposed design achieves estimations comparable to state-of-the-art methods with fast inference performance. Moreover, we show statistically that the model architecture affects the energy footprint, measured as the power in watts drawn by the device during inference. The proposed architectures can therefore be considered a promising approach for real-time monocular depth estimation, offering the best trade-off between inference performance, estimation error, and energy consumption, with the aim of improving environment perception for underwater drones, lightweight robots, and the Internet of Things.
Funder
Sapienza University of Rome
Subject
Electrical and Electronic Engineering; Biochemistry; Instrumentation; Atomic and Molecular Physics, and Optics; Analytical Chemistry
Cited by
3 articles.