MVG-Net: LiDAR Point Cloud Semantic Segmentation Network Integrating Multi-View Images

Published:2024-07-31 Issue:15 Volume:16 Page:2821
ISSN:2072-4292
Container-title:Remote Sensing
language:en
Short-container-title:Remote Sensing

Author:

Liu Yongchang¹, Liu Yawen¹, Duan Yansong¹

Affiliation:

1. School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079, China

Abstract

Deep learning techniques are increasingly applied to point cloud semantic segmentation, where single-modal point cloud data often suffer from confusion phenomena that limit accuracy. Moreover, some networks that combine image and LiDAR data lack an efficient fusion mechanism, and occlusion in the images can harm point cloud segmentation accuracy. To overcome these issues, we propose integrating multi-modal data to enhance network performance, addressing the shortcomings of existing feature-fusion strategies, which neglect crucial information and struggle to match features across modalities. This paper introduces the Multi-View Guided Point Cloud Semantic Segmentation Model (MVG-Net), which extracts multi-scale and multi-level features and contextual information from urban aerial images and LiDAR data, and then employs a multi-view image feature-aggregation module that applies spatial and channel attention to point-wise image features to capture highly correlated texture information. It also incorporates a fusion module in which image features guide the point cloud features to emphasize key information. We present a new dataset, WK2020, which combines multi-view oblique aerial images with LiDAR point clouds, to validate segmentation performance. Our method demonstrates superior performance, especially in building segmentation, achieving an F1 score of 94.6% on the Vaihingen Dataset, the highest among the methods evaluated. Furthermore, MVG-Net surpasses the other networks tested on the WK2020 Dataset. Compared with the single-modality point cloud backbone network, our model improves overall accuracy by 5.08%, average F1 score by 6.87%, and mean Intersection over Union (mIoU) by 7.9%.
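
For readers unfamiliar with the three metrics the abstract reports (overall accuracy, average F1 score, and mIoU), the following is a minimal sketch of how they are commonly derived from a per-class confusion matrix. It assumes the standard textbook definitions and unweighted averaging over classes; the abstract does not state the exact evaluation protocol, so the paper's own computation may differ (for example, in which classes are included). The function name `segmentation_metrics` and the toy matrix are hypothetical and purely illustrative.

```python
def segmentation_metrics(confusion):
    """Compute common point-wise segmentation metrics.

    confusion[i][j] is the number of points whose true class is i
    and whose predicted class is j (assumed convention).
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))

    f1_scores, ious = [], []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                        # true c, predicted otherwise
        fp = sum(confusion[r][c] for r in range(n)) - tp   # predicted c, true otherwise
        f1_den = 2 * tp + fp + fn
        iou_den = tp + fp + fn
        # Classes with no true or predicted points score 0 here (an assumption).
        f1_scores.append(2 * tp / f1_den if f1_den else 0.0)
        ious.append(tp / iou_den if iou_den else 0.0)

    return {
        "overall_accuracy": correct / total if total else 0.0,
        "per_class_f1": f1_scores,
        "average_f1": sum(f1_scores) / n,
        "per_class_iou": ious,
        "miou": sum(ious) / n,
    }


# Toy two-class confusion matrix with made-up counts (not data from the paper).
toy_confusion = [
    [1, 1],
    [0, 2],
]
toy_metrics = segmentation_metrics(toy_confusion)
```

Because F1 and IoU are computed per class and then averaged, a class that is rare but poorly segmented lowers average F1 and mIoU much more than it lowers overall accuracy, which is one reason all three figures are usually reported together.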

Funder

National Key Research and Development Program of China

Publisher

MDPI AG

Link

https://www.mdpi.com/2072-4292/16/15/2821/pdf
