End-to-end information fusion method for transformer-based stereo matching

Published:2024-04-02 Issue:6 Volume:35 Page:065408
ISSN:0957-0233
Container-title:Measurement Science and Technology
Short-container-title:Meas. Sci. Technol.

Author:

Xu Zhenghui, Wang Jingxue, Guo Jun

Abstract

In stereo matching, the application of transformers can overcome the limitations of disparity range and capture long-range matching information. However, the lack of cross-epipolar context information often leads to numerous mismatches, especially in low-texture regions. An end-to-end information fusion stereo matching method is proposed to address this issue. In the proposed method, a feature extraction method that combines dense connections and a residual block is proposed. Global and local semantic information can be effectively fused by incorporating dense connections among multiscale feature maps. Additionally, the inclusion of a residual block helps extract more representative feature maps. The idea of criss-cross attention is introduced in the transformer implicit matching process. Criss-cross attention enables the capture of cross-epipolar context information by combining horizontal and vertical attention mechanisms. This method improves the matching accuracy from the perspective of multi-path information fusion. According to the matching results, the disparity regression layer and the context adjustment layer are used to generate the initial and final disparity maps, respectively. The proposed method is evaluated on the Scene Flow, KITTI 2012, and Middlebury 2014 datasets. Experimental results indicate that the proposed method effectively enhances matching accuracy. Moreover, the proposed method exhibits strong generalization ability, allowing for direct application to synthetic, real outdoor, and real indoor scene images.
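The abstract describes criss-cross attention as combining horizontal and vertical attention. As a rough illustration only, the sketch below shows the generic idea in NumPy: each pixel attends to every pixel in its own row and its own column, with one softmax over both sets. This is not the authors' implementation; the function name `criss_cross_attention`, the `(H, W, C)` tensor layout, and the scaled dot-product scoring are assumptions made for this example.

```python
import numpy as np

def criss_cross_attention(q, k, v):
    """Illustrative single-head criss-cross attention (hypothetical helper).

    q, k: arrays of shape (H, W, C); v: array of shape (H, W, D).
    Returns an array of shape (H, W, D).
    """
    H, W, C = q.shape
    scale = 1.0 / np.sqrt(C)

    # Horizontal branch: pixel (i, j) scored against every pixel in row i.
    row = np.einsum("ijc,ikc->ijk", q, k) * scale  # (H, W, W)
    # Vertical branch: pixel (i, j) scored against every pixel in column j.
    col = np.einsum("ijc,kjc->ijk", q, k) * scale  # (H, W, H)

    # The pixel itself appears in both branches; mask it in the vertical
    # branch so it is counted only once.
    ii = np.arange(H)[:, None]
    jj = np.arange(W)[None, :]
    col[ii, jj, ii] = -np.inf

    # One softmax over the H + W - 1 positions on the criss-cross path.
    scores = np.concatenate([row, col], axis=-1)  # (H, W, W + H)
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    w_row, w_col = weights[..., :W], weights[..., W:]

    # Aggregate values along the row and along the column.
    out = np.einsum("ijk,ikd->ijd", w_row, v)
    out = out + np.einsum("ijk,kjd->ijd", w_col, v)
    return out
```

In a stereo setting, the horizontal branch would correspond to attention along the epipolar line and the vertical branch to the cross-epipolar context mentioned in the abstract; how the paper wires these into its transformer is not specified here.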

Funder

Fundamental Applied Research Foundation of Liaoning Province

National Natural Science Foundation of China

Liaoning Revitalization Talents Program

Publisher

IOP Publishing

Link

https://iopscience.iop.org/article/10.1088/1361-6501/ad36d7/pdf

References: 45 articles.

1. SA-Net: scene-aware network for cross-domain stereo matching;Chong;Appl. Intell.,2023

2. Dense feature learning and compact cost aggregation for deep stereo matching;Yin;IEEE Access,2022

3. Exploiting semantic and boundary information for stereo matching;Peng;J. Signal Process. Syst.,2023

4. Integrated image matching and segmentation for 3D surface reconstruction in urban areas;Ye;Photogramm. Eng. Remote Sens.,2018

5. Visionblender: a tool to efficiently generate computer vision datasets for robotic surgery;Cartucho;Comput. Methods Biomech. Biomed. Eng.,2021

Cited by 1 article.

1. A cascaded GRU-based stereoscopic matching network for precise plank measurement;Measurement Science and Technology;2024-05-30