UnVELO: Unsupervised Vision-Enhanced LiDAR Odometry with Online Correction
Authors:
Li Bin 1, Ye Haifeng 1, Fu Sihan 1, Gong Xiaojin 1, Xiang Zhiyu 1
Affiliation:
1. College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou 310027, China
Abstract
Because visual and LiDAR information are complementary, the two modalities have been fused to facilitate many vision tasks. However, current studies of learning-based odometry focus mainly on either the visual or the LiDAR modality, leaving visual–LiDAR odometry (VLO) under-explored. This work proposes a new method to implement an unsupervised VLO that adopts a LiDAR-dominant scheme to fuse the two modalities; we therefore refer to it as unsupervised vision-enhanced LiDAR odometry (UnVELO). It converts 3D LiDAR points into a dense vertex map via spherical projection and generates a vertex color map by colorizing each vertex with visual information. A point-to-plane-distance-based geometric loss and a photometric-error-based visual loss are then placed on locally planar regions and cluttered regions, respectively. Finally, we design an online pose-correction module to refine the pose predicted by the trained UnVELO at test time. In contrast to the vision-dominant fusion scheme adopted in most previous VLOs, our LiDAR-dominant method uses dense representations for both modalities, which facilitates the visual–LiDAR fusion. Moreover, it relies on accurate LiDAR measurements rather than predicted, and therefore noisy, dense depth maps, which significantly improves robustness to illumination variations as well as the efficiency of the online pose correction. Experiments on the KITTI and DSEC datasets showed that our method outperformed previous two-frame-based learning methods and was competitive with hybrid methods that integrate global optimization over multiple or all frames.
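To make the two core geometric operations mentioned in the abstract concrete, the following is a minimal, illustrative Python sketch of (a) spherical projection of a LiDAR scan into a dense vertex map and (b) the point-to-plane distances underlying the geometric loss. The image resolution, vertical field of view, and all function names here are assumptions chosen for illustration; they are not taken from the paper's implementation.

```python
# Minimal sketch of spherical projection and a point-to-plane residual.
# Shapes, FOV values, and function names are illustrative assumptions,
# not the authors' released code.
import numpy as np

def spherical_projection(points, H=64, W=1024, fov_up=3.0, fov_down=-25.0):
    """Project an (N, 3) LiDAR scan into a dense (H, W, 3) vertex map."""
    fov_up_rad = np.radians(fov_up)
    fov_down_rad = np.radians(fov_down)
    fov = fov_up_rad - fov_down_rad

    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    depth = np.linalg.norm(points, axis=1)
    yaw = np.arctan2(y, x)                               # azimuth in [-pi, pi]
    pitch = np.arcsin(np.clip(z / np.maximum(depth, 1e-8), -1.0, 1.0))

    # Normalize the angles to integer pixel coordinates.
    u = 0.5 * (1.0 - yaw / np.pi) * W                    # column index
    v = (1.0 - (pitch - fov_down_rad) / fov) * H         # row index
    u = np.clip(np.floor(u), 0, W - 1).astype(np.int32)
    v = np.clip(np.floor(v), 0, H - 1).astype(np.int32)

    # Keep the closest point per pixel: write far points first so that
    # nearer points overwrite them.
    order = np.argsort(-depth)
    vertex_map = np.zeros((H, W, 3), dtype=np.float32)
    vertex_map[v[order], u[order]] = points[order]
    return vertex_map

def point_to_plane_residual(src_pts, tgt_pts, tgt_normals):
    """Point-to-plane distances |n^T (p_src - p_tgt)|, the quantity a
    point-to-plane geometric loss penalizes on locally planar regions."""
    return np.abs(np.sum(tgt_normals * (src_pts - tgt_pts), axis=1))
```

In the full method, as described above, each vertex of this map is additionally colorized by visual information to form the vertex color map, and the photometric loss is applied where the planarity assumption breaks down, i.e., on cluttered regions.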
Funder
Primary Research and Development Plan of Zhejiang Province
Subject
Electrical and Electronic Engineering; Biochemistry; Instrumentation; Atomic and Molecular Physics, and Optics; Analytical Chemistry