Affiliation:
1. School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China
Abstract
Deep learning techniques are increasingly applied to point cloud semantic segmentation, where single-modal point cloud data often suffer from confusion phenomena that limit accuracy. Moreover, some networks that combine image and LiDAR data lack an efficient fusion mechanism, and occlusion in the images can harm the segmentation accuracy of the point cloud. To overcome these issues, we propose integrating multi-modal data to enhance network performance, addressing the shortcomings of existing feature-fusion strategies, which neglect crucial information and struggle to match modal features effectively. This paper introduces the Multi-View Guided Point Cloud Semantic Segmentation Model (MVG-Net), which extracts multi-scale and multi-level features and contextual information from urban aerial images and LiDAR data, and then employs a multi-view image feature-aggregation module to capture highly correlated texture information by applying spatial and channel attention to point-wise image features. Additionally, it incorporates a fusion module in which image features guide the point cloud features to emphasize key information. We present a new dataset, WK2020, which combines multi-view oblique aerial images with LiDAR point clouds, to validate segmentation efficacy. Our method demonstrates superior performance, especially in building segmentation, achieving an F1 score of 94.6% on the Vaihingen Dataset, the highest among the methods evaluated. Furthermore, MVG-Net surpasses the other networks tested on the WK2020 Dataset. Compared with the backbone network that uses only the point cloud modality, our model improves overall accuracy by 5.08%, average F1 score by 6.87%, and mean Intersection over Union (mIoU) by 7.9%.
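For readers unfamiliar with the three metrics reported above, the following is a minimal sketch of how overall accuracy, average F1 score, and mIoU are commonly computed from a per-class confusion matrix. It is not taken from the paper: the function name `segmentation_metrics`, the dictionary keys, and the toy 3-class matrix are hypothetical, and the authors' evaluation protocol (for example, which classes are included in the averages) may differ.

```python
def segmentation_metrics(confusion):
    """Compute overall accuracy, mean F1, and mIoU from a confusion matrix.

    confusion[i][j] is the number of points whose true class is i and
    whose predicted class is j (hypothetical convention for this sketch).
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))

    f1_scores, ious = [], []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                        # true c, predicted other
        fp = sum(confusion[r][c] for r in range(n)) - tp   # predicted c, true other
        f1_denom = 2 * tp + fp + fn
        iou_denom = tp + fp + fn
        # Classes with no true or predicted points score 0 here (an assumption).
        f1_scores.append(2 * tp / f1_denom if f1_denom else 0.0)
        ious.append(tp / iou_denom if iou_denom else 0.0)

    return {
        "overall_accuracy": correct / total if total else 0.0,
        "mean_f1": sum(f1_scores) / n,
        "mean_iou": sum(ious) / n,
        "per_class_f1": f1_scores,
        "per_class_iou": ious,
    }


# Toy 3-class confusion matrix (made-up counts, not results from the paper).
toy_confusion = [
    [50, 2, 3],
    [4, 40, 6],
    [1, 5, 39],
]
metrics = segmentation_metrics(toy_confusion)
```

Averaging F1 and IoU over classes with equal weight means that rare classes count as much as frequent ones, which is why these two figures can move differently from overall accuracy.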
Funder
National Key Research and Development Program of China