Multi-Scale Cost Attention and Adaptive Fusion Stereo Matching Network
Published: 2023-03-28
Issue: 7
Volume: 12
Page: 1594
ISSN: 2079-9292
Container-title: Electronics
Language: en
Short-container-title: Electronics
Author:
Liu Zhenguo (1), Li Zhao (1), Ao Wengang (2), Zhang Shaoshuang (1), Liu Wenlong (1), He Yizhi (1)
Affiliation:
1. College of Computer Science and Technology, Shandong University of Technology, Zibo 255000, China
2. School of Mechanical Engineering, Chongqing Technology and Business University, Chongqing 400000, China
Abstract
In convolution-based stereo matching, 2D convolution is less computationally expensive and faster than 3D convolution. However, the initial cost volume that 2D convolution produces in the correlation layer carries less information than the cost volume built by 3D convolution methods, so regions of the disparity map affected by illumination are less robust, which degrades accuracy. To address this lack of information in the 2D convolution cost volume, this paper proposes a multi-scale cost attention and adaptive fusion stereo matching network (MCAFNet) based on AANet+. First, the extracted features are used to compute an initial cost volume, which is fed into a multi-scale adaptive cost attention module that generates attention weights; these weights are combined with the initial cost volume to suppress irrelevant information and enrich the volume. Second, the cost aggregation stage is improved by adding a multi-scale adaptive fusion module that raises the fusion efficiency of cross-scale cost aggregation. MCAFNet reduces the end-point error (EPE) on the Scene Flow dataset to 0.66, and achieves error matching rates of 1.60% and 2.22% on the KITTI 2012 and KITTI 2015 datasets, respectively.
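The two steps the abstract describes — building a correlation-based cost volume from 2D features, then reweighting it with attention to suppress irrelevant entries — can be sketched as follows. This is a minimal NumPy illustration of the general technique, not the paper's implementation: the function names, the per-pixel dot-product correlation, and the use of a softmax over the disparity axis as the attention weight are all assumptions for illustration.

```python
import numpy as np

def correlation_cost_volume(feat_l, feat_r, max_disp):
    """Correlation-style cost volume used by 2D-convolution stereo methods.
    feat_l, feat_r: (C, H, W) left/right feature maps.
    Returns a (max_disp, H, W) volume; entry [d, y, x] is the mean
    channel-wise product of the left feature at x and the right at x - d."""
    C, H, W = feat_l.shape
    cost = np.zeros((max_disp, H, W), dtype=feat_l.dtype)
    for d in range(max_disp):
        if d == 0:
            cost[d] = (feat_l * feat_r).mean(axis=0)
        else:
            # Columns x < d have no valid match in the right image; leave zero.
            cost[d, :, d:] = (feat_l[:, :, d:] * feat_r[:, :, :-d]).mean(axis=0)
    return cost

def attention_reweight(cost):
    """Hypothetical attention step: a softmax over the disparity axis yields
    per-pixel weights that are multiplied back onto the cost volume,
    emphasizing likely disparities and suppressing the rest."""
    e = np.exp(cost - cost.max(axis=0, keepdims=True))  # stable softmax
    attn = e / e.sum(axis=0, keepdims=True)
    return attn * cost
```

In the paper the attention weights come from a learned multi-scale module rather than a fixed softmax, but the shape of the operation is the same: a (D, H, W) weight tensor combined elementwise with the (D, H, W) initial cost volume.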
Funder
National Key R&D Program of China
Subject
Electrical and Electronic Engineering; Computer Networks and Communications; Hardware and Architecture; Signal Processing; Control and Systems Engineering
References (43 articles)
1. Xu, H.F., and Zhang, J.Y. (2020, January 13–19). AANet: Adaptive Aggregation Network for Efficient Stereo Matching. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.
2. Zhu, Z., He, M., Dai, Y., Rao, Z., and Li, B. (2019, January 19–21). Multi-scale cross-form pyramid network for stereo matching. Proceedings of the 2019 14th IEEE Conference on Industrial Electronics and Applications (ICIEA), Xi’an, China.
3. Kendall, A., Martirosyan, H., Dasgupta, S., Henry, P., Kennedy, R., Bachrach, A., and Bry, A. (2017, January 22–29). End-to-end learning of geometry and context for deep stereo regression. Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy.
4. Wu, Z., Wu, X., Zhang, X., Wang, S., and Ju, L. (2019, October 27–November 2). Semantic stereo matching with pyramid cost volumes. Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea.
5. Shen, Z., Dai, Y., and Rao, Z. (2020). MSMD-Net: Deep stereo matching with multi-scale and multi-dimension cost volume. arXiv.