Unsupervised single-shot depth estimation using perceptual reconstruction
Published: 2023-08-11
Volume: 34
Issue: 5
ISSN: 0932-8092
Container-title: Machine Vision and Applications
Language: en
Author: Angermann Christoph, Schwab Matthias, Haltmeier Markus, Laubichler Christian, Jónsson Steinbjörn
Abstract
Real-time estimation of actual object depth is an essential module for various autonomous system tasks such as 3D reconstruction, scene understanding and condition assessment. During the last decade of machine learning, extensive deployment of deep learning methods to computer vision tasks has yielded approaches that succeed in achieving realistic depth synthesis out of a simple RGB modality. Most of these models are based on paired RGB-depth data and/or the availability of video sequences and stereo images. However, the lack of RGB-depth pairs, video sequences, or stereo images makes depth estimation a challenging task that needs to be explored in more detail. This study builds on recent advances in the field of generative neural networks in order to establish fully unsupervised single-shot depth estimation. Two generators for RGB-to-depth and depth-to-RGB transfer are implemented and simultaneously optimized using the Wasserstein-1 distance, a novel perceptual reconstruction term, and hand-crafted image filters. We comprehensively evaluate the models using a custom-generated industrial surface depth data set as well as the Texas 3D Face Recognition Database, the CelebAMask-HQ database of human portraits and the SURREAL dataset that records body depth. For each evaluation dataset, the proposed method shows a significant increase in depth accuracy compared to state-of-the-art single-image transfer methods.
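The training objective described in the abstract combines an adversarial Wasserstein-1 term with a perceptual reconstruction term on the cycle RGB → depth → RGB. The sketch below illustrates these two loss components in plain numpy; the function names, the fixed feature extractors standing in for a perceptual network, and the critic-score inputs are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def perceptual_reconstruction_loss(x, x_rec, feature_maps):
    """Illustrative perceptual reconstruction term: mean absolute
    difference between feature representations of an image and its
    cycle-reconstruction, averaged over a list of (hypothetical)
    feature extractors. In the paper these would be learned or
    hand-crafted filter responses, not the raw callables used here."""
    return float(np.mean([np.mean(np.abs(f(x) - f(x_rec)))
                          for f in feature_maps]))

def critic_wasserstein_loss(critic_real, critic_fake):
    """Wasserstein-1 critic objective (minimized by the critic):
    E[critic(fake)] - E[critic(real)]. The generator is trained to
    maximize E[critic(fake)], i.e. to minimize the negated term."""
    return float(np.mean(critic_fake) - np.mean(critic_real))

# Toy usage with identity-style "feature maps" as stand-ins:
x = np.ones((4, 4))
x_rec = np.zeros((4, 4))
feats = [lambda a: a, lambda a: 2.0 * a]
rec_loss = perceptual_reconstruction_loss(x, x_rec, feats)   # 1.5
adv_loss = critic_wasserstein_loss(np.array([1.0, 1.0]),
                                   np.array([0.0, 0.0]))     # -1.0
```

In a WGAN-style setup the critic additionally needs a Lipschitz constraint (weight clipping or a gradient penalty), which this minimal sketch omits.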
Funder
Österreichische Forschungsförderungsgesellschaft
Publisher
Springer Science and Business Media LLC
Subject
Computer Science Applications, Computer Vision and Pattern Recognition, Hardware and Architecture, Software
Cited by: 4 articles.