Efficient Multi-Scale Stereo-Matching Network Using Adaptive Cost Volume Filtering

Published: 2022-07-23
Container-title: Sensors
Volume: 22 Issue: 15 Page: 5500
ISSN: 1424-8220
Language: en

Author:

Jeon Suyeon, Heo Yong Seok

Abstract

While recent deep learning-based stereo-matching networks have shown outstanding advances, some challenges remain unsolved. First, most state-of-the-art stereo models employ 3D convolutions for 4D cost volume aggregation, which limits the deployment of these networks in resource-limited mobile environments owing to their heavy computation and memory consumption. Although some efficient networks exist, most of them are still too computationally expensive to run on mobile computing devices in real time. Second, most stereo networks supervise cost volumes only indirectly, through a disparity regression loss computed with the soft-argmax function. This causes problems in ambiguous regions, such as object boundaries, because many unreasonable cost distributions are possible, which results in overfitting. A few works address this problem by generating an artificial cost distribution using only the ground truth disparity value, which is insufficient to fully regularize the cost volume. To address these problems, we first propose an efficient multi-scale sequential feature fusion network (MSFFNet). Specifically, we connect multi-scale SFF modules in parallel with a cross-scale fusion function to generate a set of cost volumes at different scales. These cost volumes are then effectively combined using the proposed interlaced concatenation method. Second, we propose an adaptive cost-volume-filtering (ACVF) loss function that directly supervises our estimated cost volume. The proposed ACVF loss adds constraints directly to the cost volume using the probability distribution generated from the ground truth disparity map and the distribution estimated by a teacher network that achieves higher accuracy. Results of several experiments on representative stereo-matching datasets show that our proposed method is more efficient than previous methods. Our network architecture uses fewer parameters and generates reasonable disparity maps faster than existing state-of-the-art stereo models. Concretely, our network achieves 1.01 EPE with a runtime of 42 ms, 2.92 M parameters, and 97.96 G FLOPs on the Scene Flow test set. Compared with PSMNet, our method is 89% faster and 7% more accurate with 45% fewer parameters.
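The indirect-supervision problem described in the abstract can be illustrated with a small sketch. It is not the paper's ACVF loss or network code. It only shows, for a single pixel and under assumed toy cost values, that soft-argmax regression can return the same disparity for a well-behaved (unimodal) cost profile and an unreasonable (bimodal) one, while a loss applied directly to the distribution tells them apart. The function names, the 9-candidate disparity range, and the target built from the ground truth disparity (`target_distribution`, with its `sharpness` parameter) are all hypothetical choices made for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def soft_argmax(costs):
    """Expected disparity under softmax(-cost) for one pixel."""
    probs = softmax([-c for c in costs])
    return sum(d * p for d, p in enumerate(probs))

def target_distribution(gt_disparity, num_disparities, sharpness=1.0):
    """Hypothetical unimodal target centred on the ground truth disparity."""
    return softmax([-abs(d - gt_disparity) * sharpness
                    for d in range(num_disparities)])

def cross_entropy(target, costs):
    """Cross-entropy between a target distribution and softmax(-cost)."""
    probs = softmax([-c for c in costs])
    return -sum(t * math.log(p + 1e-12) for t, p in zip(target, probs))

# Assumed toy cost profiles over 9 disparity candidates; ground truth is 4.
unimodal = [abs(d - 4) * 2.0 for d in range(9)]                  # one minimum at 4
bimodal = [min(abs(d - 1), abs(d - 7)) * 2.0 for d in range(9)]  # minima at 1 and 7

target = target_distribution(gt_disparity=4, num_disparities=9)

# Both profiles regress to (approximately) the same disparity ...
print(soft_argmax(unimodal), soft_argmax(bimodal))
# ... but a loss on the distribution itself penalises the bimodal profile more.
print(cross_entropy(target, unimodal), cross_entropy(target, bimodal))
```

Because the bimodal profile is symmetric about the ground truth, a regression loss on the soft-argmax output cannot distinguish it from the unimodal one, which is the kind of ambiguity that direct supervision of the cost volume is meant to remove.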

Funder

BK21 FOUR program of the National Research Foundation of Korea

Publisher

MDPI AG

Subject

Electrical and Electronic Engineering, Biochemistry, Instrumentation, Atomic and Molecular Physics, and Optics, Analytical Chemistry

Link

https://www.mdpi.com/1424-8220/22/15/5500/pdf

References: 39 articles.

1. Stereo matching algorithm for autonomous positioning of underground mine robots;Huang;Proceedings of the International Conference on Robots & Intelligent System,2018

2. Dense stereo matching with application to augmented reality;Zenati;Proceedings of the IEEE International Conference on Signal Processing and Communications,2007

3. Distance estimation in virtual reality and augmented reality: A survey;El Jamiy;Proceedings of the IEEE International Conference on Electro Information Technology,2019

4. Deepdriving: Learning affordance for direct perception in autonomous driving;Chen;Proceedings of the IEEE International Conference on Computer Vision,2015

Cited by 4 articles.

1. Methods for volume inference of non-medical objects from images: A short review;Journal of Ambient Intelligence and Smart Environments;2024-01-17

2. An application of stereo matching algorithm based on transfer learning on robots in multiple scenes;Scientific Reports;2023-08-06

3. Multi-Scale Cost Attention and Adaptive Fusion Stereo Matching Network;Electronics;2023-03-28

4. Robust Estimation and Optimized Transmission of 3D Feature Points for Computer Vision on Mobile Communication Network;Sensors;2022-11-07