Affiliation:
1. Computer Engineering College, Jimei University, Xiamen, China
Abstract
Stereo matching is a fundamental and long‐standing task in computer vision. Although learning‐based stereo matching algorithms have made remarkable progress, two major challenges persist. First, existing cost aggregation methods that stack three‐dimensional convolutions are complex, leading to heavy computation and memory costs. Second, these methods still struggle to establish reliable matches in weakly matchable regions such as edges and thin structures. To overcome these limitations, we propose an accurate and efficient network called the Attention‐guided Aggregation and Error‐aware Enhancement Network (AAEE‐Net). We design an Attention‐guided Aggregation Mechanism (AAM) based on simple image features: attention weights generated from the image features guide cost aggregation, giving a more efficient and effective strategy. Additionally, we propose an Error‐aware Enhancement Module (EEM) that refines the raw disparity by combining high‐frequency information from the original image with the warp error between the left and right views. EEM enables the network to learn to correct errors, recovering subtle details and sharp edges. Experimental results on the SceneFlow and KITTI benchmark datasets demonstrate that AAEE‐Net achieves state‐of‐the‐art performance with low inference time. Qualitative results show that AAEE‐Net significantly improves predictions, especially for thin structures.
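The abstract does not say how the warp error used by EEM is computed. As a rough illustration only, the sketch below shows one common formulation for a single rectified scanline: each left pixel at column x is compared with the right‐view pixel at x − d, sampled with linear interpolation. The function name `warp_error_row`, the absolute‐difference error, and the border clamping are all assumptions made for this example, not details taken from AAEE‐Net.

```python
def warp_error_row(left, right, disparity):
    """Absolute photometric error for one rectified scanline (illustrative).

    left, right: lists of pixel intensities for the same image row.
    disparity:   per-pixel, non-negative disparity for the left view.
    Assumption: a left pixel at column x matches the right pixel at x - d.
    """
    width = len(left)
    errors = []
    for x in range(width):
        src = x - disparity[x]                    # matching column in the right view
        src = min(max(src, 0.0), width - 1.0)     # clamp to the image border (assumption)
        x0 = int(src)                             # floor, valid because src >= 0
        x1 = min(x0 + 1, width - 1)
        t = src - x0
        warped = (1.0 - t) * right[x0] + t * right[x1]  # linear interpolation
        errors.append(abs(left[x] - warped))
    return errors


# Toy row: the right view is the left view shifted by one pixel.
left = [10, 20, 30, 40, 50]
right = [20, 30, 40, 50, 50]
good = warp_error_row(left, right, [1, 1, 1, 1, 1])  # correct disparity
bad = warp_error_row(left, right, [0, 0, 0, 0, 0])   # wrong disparity
```

In this toy case the error vanishes wherever the disparity is correct and the match lies inside the image, and stays large where the disparity is wrong. A residual of this kind is the sort of cue a refinement module could use to locate regions that need correction.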
Funder
National Natural Science Foundation of China
Subject
Computational Theory and Mathematics, Computer Networks and Communications, Computer Science Applications, Theoretical Computer Science, Software