Fast Data Generation for Training Deep-Learning 3D Reconstruction Approaches for Camera Arrays
Published: 2023-12-27
Issue: 1
Volume: 10
Page: 7
ISSN: 2313-433X
Container-title: Journal of Imaging
Language: en
Short-container-title: J. Imaging
Author:
Barrios Théo (1), Prévost Stéphanie (1), Loscos Céline (1)
Affiliation:
1. LICIIS Laboratory, University of Reims Champagne-Ardenne, 51100 Reims, France
Abstract
In the last decade, many neural network algorithms have been proposed to solve depth reconstruction. Our focus is on reconstruction from images captured by multi-camera arrays, which are grids of vertically and horizontally aligned, uniformly spaced cameras. Training these networks using supervised learning requires data with ground truth. Existing datasets simulate specific configurations: for example, a fixed-size camera array or a fixed spacing between cameras. When the distance between cameras is small, the array is said to have a short baseline; light-field cameras, with a baseline of less than a centimeter, fall into this category. Conversely, an array with a large distance between cameras is said to have a wide baseline. In this paper, we present a purely virtual data generator for creating large training datasets that can adapt to any camera array configuration. Its parameters include the size of the array (number of cameras) and the distance between two cameras. The generator creates virtual scenes by randomly selecting objects and textures while following user-defined parameters such as the disparity range and image parameters (resolution, color space). Generated data are used only for the learning phase. They are unrealistic but present concrete challenges for disparity reconstruction, such as thin elements, and textures are assigned to objects at random to avoid color bias. Our experiments focus on the wide-baseline configuration, for which more datasets are needed. We validate the generator by testing the generated datasets with known deep-learning approaches as well as depth reconstruction algorithms. The validation experiments proved successful.
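To make the array parameters concrete, the following is a minimal sketch, not the authors' generator. It assumes a planar grid of rectified pinhole cameras and uses the standard stereo relation Z = f * B / d to turn a disparity range between adjacent cameras into a depth interval; that relation is not stated in the abstract. All names (`ArrayConfig`, `camera_positions`, `depth_range`) are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArrayConfig:
    """Hypothetical description of a camera array (illustrative only)."""
    rows: int          # number of cameras along the vertical axis
    cols: int          # number of cameras along the horizontal axis
    baseline_m: float  # distance between two adjacent cameras, in meters


def camera_positions(cfg):
    """Return (x, y) camera offsets in meters, row-major, centered on the array."""
    x0 = -(cfg.cols - 1) * cfg.baseline_m / 2
    y0 = -(cfg.rows - 1) * cfg.baseline_m / 2
    return [
        (x0 + c * cfg.baseline_m, y0 + r * cfg.baseline_m)
        for r in range(cfg.rows)
        for c in range(cfg.cols)
    ]


def depth_range(cfg, focal_px, disp_min_px, disp_max_px):
    """Depth interval (near, far) in meters for a disparity range in pixels.

    Assumes rectified pinhole cameras, so Z = focal_px * baseline / disparity,
    with disparity measured between two adjacent cameras.
    """
    if not 0 < disp_min_px <= disp_max_px:
        raise ValueError("expected 0 < disp_min_px <= disp_max_px")
    near = focal_px * cfg.baseline_m / disp_max_px  # largest disparity = closest point
    far = focal_px * cfg.baseline_m / disp_min_px   # smallest disparity = farthest point
    return near, far


if __name__ == "__main__":
    cfg = ArrayConfig(rows=3, cols=3, baseline_m=0.1)
    print(camera_positions(cfg))
    print(depth_range(cfg, focal_px=1000.0, disp_min_px=10.0, disp_max_px=100.0))
```

Under these assumptions, widening the baseline while keeping the disparity range fixed pushes the usable depth interval farther from the array, which is one reason a generator that accepts the baseline and the disparity range as parameters can cover both short-baseline and wide-baseline setups.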
Subject
Electrical and Electronic Engineering; Computer Graphics and Computer-Aided Design; Computer Vision and Pattern Recognition; Radiology, Nuclear Medicine and Imaging