DSC-MVSNet: attention aware cost volume regularization based on depthwise separable convolution for multi-view stereo

Published:2023-06-07 Issue:6 Volume:9 Page:6953-6969
ISSN:2199-4536
Container-title:Complex & Intelligent Systems
language:en
Short-container-title:Complex Intell. Syst.

Author:

Zhang Song, Wei Zhiwei, Xu Wenjia, Zhang Lili, Wang Yang, Zhou Xin, Liu Junyi

Abstract

Deep learning has recently been shown to deliver excellent performance in multi-view stereo (MVS). However, deep learning-based MVS approaches struggle to balance efficiency and effectiveness. To this end, we propose DSC-MVSNet, a novel coarse-to-fine, end-to-end framework for more efficient and more accurate depth estimation in MVS. In particular, we propose an attention-aware 3D UNet-shaped network, which first uses depthwise separable convolutions for cost volume regularization. This mechanism enables effective aggregation of information and significantly reduces the model parameters and computation by decomposing the ordinary convolution on the cost volume into a depthwise convolution and a pointwise convolution. In addition, a 3D-Attention module is proposed to alleviate the feature mismatching problem in cost volume regularization and to aggregate the important information of the cost volume in three dimensions (i.e. channel, space, and depth). Moreover, we propose an efficient Feature Transfer Module that upsamples the low-resolution (LR) depth map to a high-resolution (HR) depth map to achieve higher accuracy. With extensive experiments on two benchmark datasets, i.e. DTU and Tanks & Temples, we demonstrate that the number of parameters in our model is reduced to 25% of that of the state-of-the-art model MVSNet. In addition, our method outperforms the state-of-the-art models or achieves accuracy on par with them. Our source code is available at https://github.com/zs670980918/DSC-MVSNet.
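The parameter saving that the abstract attributes to depthwise separable convolution can be illustrated with a simple weight count. The sketch below is not taken from the DSC-MVSNet code: the function names `conv3d_params` and `dsc3d_params`, the 3x3x3 kernel size, and the channel widths are all assumptions chosen for illustration. It compares one ordinary 3D convolution with a depthwise convolution followed by a 1x1x1 pointwise convolution. Note that the 25% figure in the abstract refers to the whole model relative to MVSNet, not to the per-layer ratio computed here.

```python
def conv3d_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of an ordinary 3D convolution with a k*k*k kernel (no bias)."""
    return c_in * c_out * k ** 3


def dsc3d_params(c_in: int, c_out: int, k: int) -> int:
    """Weight count of a depthwise separable 3D convolution (no bias).

    Depthwise step: one k*k*k filter per input channel.
    Pointwise step: a 1*1*1 convolution mixing c_in channels into c_out.
    """
    depthwise = c_in * k ** 3
    pointwise = c_in * c_out
    return depthwise + pointwise


if __name__ == "__main__":
    # Channel widths below are hypothetical, not the paper's configuration.
    for c_in, c_out in [(8, 16), (16, 32), (32, 64)]:
        full = conv3d_params(c_in, c_out, k=3)
        separable = dsc3d_params(c_in, c_out, k=3)
        print(f"{c_in:>3} -> {c_out:<3} ordinary={full:>6} "
              f"separable={separable:>5} ratio={separable / full:.3f}")
```

Under these assumptions the per-layer ratio equals 1/c_out + 1/k^3, so the saving grows with the number of output channels and with the kernel size.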

Funder

Youth Innovation Promotion Association

Publisher

Springer Science and Business Media LLC

Subject

Computational Mathematics,Engineering (miscellaneous),Information Systems,Artificial Intelligence

Link

https://link.springer.com/content/pdf/10.1007/s40747-023-01106-3.pdf

References (63 articles)

1. Aanæs H, Jensen RR, Vogiatzis G, Tola E, Dahl AB (2016) Large-scale data for multiple-view stereopsis. Int J Comput Vision 120:153–168

2. Furukawa Y, Hernández C (2015) Multi-view stereo: a tutorial. Found Trends Comput Graph Vis 9:1–148

3. Kim H, Guillemaut J-Y, Takai T, Sarim M, Hilton A (2012) Outdoor dynamic 3-d scene reconstruction. IEEE Trans Circuits Syst Video Technol 22(11):1611–1622. https://doi.org/10.1109/TCSVT.2012.2202185