Affiliation:
1. Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100101, China
2. University of Chinese Academy of Sciences, Beijing 100049, China
Abstract
With the development of remote sensing satellite technology for Earth observation, remote sensing stereo images have been used for three-dimensional (3D) reconstruction in fields such as urban planning and construction. However, remote sensing images often contain noise, occluded regions, untextured areas, and repeated textures, which reduce stereo matching accuracy and degrade the quality of 3D reconstruction results. To reduce the impact of complex scenes on stereo matching while maintaining both speed and accuracy, we propose a new end-to-end stereo matching network based on convolutional neural networks (CNNs). The proposed network learns features at different scales from the original images and constructs cost volumes at multiple scales to obtain richer scale information. When constructing the cost volume, we include negative disparities, because remote sensing stereo image pairs commonly contain both negative and non-negative disparities. For cost aggregation, we employ a 3D convolution-based encoder–decoder structure that allows the network to adaptively aggregate information. Before feature aggregation, we also introduce an attention module to retain more valuable feature information, enhance feature representation, and obtain a higher-quality disparity map. Trained on the publicly available US3D dataset, the network achieves an end-point error (EPE) of 1.115 pixels and an error pixel ratio (D1) of 5.32% on the test dataset, with an inference time of 92 ms. Compared with existing state-of-the-art models, our model achieves higher accuracy, which benefits the 3D reconstruction of remote sensing images.
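To make the signed disparity range and the two reported metrics concrete, the following is a minimal sketch in plain Python, not the network described in the abstract. It builds an absolute-difference cost volume for a single image row over a range that includes negative disparities, picks the lowest-cost candidate per pixel, and computes EPE and D1. Several things here are assumptions for illustration only: the function names, the sign convention (left[x] matches right[x - d]), the use of raw intensities and winner-take-all selection in place of learned features and 3D convolutional aggregation, and the 3-pixel threshold for D1, which the abstract does not state.

```python
def scanline_cost_volume(left, right, d_min, d_max):
    """Absolute-difference matching cost for one image row.

    Assumed convention: left[x] corresponds to right[x - d], so d can be
    negative as well as non-negative. cost[i][x] is the cost of disparity
    disparities[i] at column x; candidates outside the row get infinity.
    """
    width = len(left)
    disparities = list(range(d_min, d_max + 1))
    cost = []
    for d in disparities:
        row = []
        for x in range(width):
            xr = x - d  # matching column in the right row
            if 0 <= xr < width:
                row.append(abs(left[x] - right[xr]))
            else:
                row.append(float("inf"))
        cost.append(row)
    return disparities, cost


def winner_take_all(disparities, cost):
    """Pick, for every column, the disparity with the lowest cost."""
    width = len(cost[0])
    return [
        disparities[min(range(len(disparities)), key=lambda i: cost[i][x])]
        for x in range(width)
    ]


def epe_and_d1(pred, truth, threshold=3.0):
    """Mean absolute disparity error (EPE) and the fraction of pixels whose
    error exceeds `threshold` (D1). The 3-pixel default is an assumption."""
    errors = [abs(p - t) for p, t in zip(pred, truth)]
    epe = sum(errors) / len(errors)
    d1 = sum(e > threshold for e in errors) / len(errors)
    return epe, d1


# Toy row: the first six left pixels equal right[x + 2], i.e. d = -2.
right = [0, 10, 20, 30, 40, 50, 60, 70]
left = [20, 30, 40, 50, 60, 70, 80, 90]
disparities, cost = scanline_cost_volume(left, right, d_min=-3, d_max=3)
pred = winner_take_all(disparities, cost)
epe, d1 = epe_and_d1(pred[:6], [-2] * 6)
```

A search range restricted to d >= 0 could not represent the shift in this toy row at all, which is the motivation the abstract gives for admitting negative disparities in the cost volume.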
Funder
National Natural Science Foundation of China
National Key Research and Development Program of China
Major Science and Technology Projects of Yunnan Province
Subject
General Earth and Planetary Sciences