DLALoc: Deep-Learning Accelerated Visual Localization Based on Mesh Representation
Published: 2023-01-13
Issue: 2
Volume: 13
Page: 1076
ISSN: 2076-3417
Container-title: Applied Sciences
Language: English
Short-container-title: Applied Sciences
Authors: Zhang Peng, Liu Wenfen
Abstract
Visual localization, i.e., estimating the camera pose within a known three-dimensional (3D) model, is a basic component of numerous applications such as autonomous driving and augmented reality. The most widely used methods in the literature are based on local feature matching between a query image to be localized and database images with known camera poses and local features. However, such methods still struggle with changing illumination conditions and seasonal changes. Additionally, the scene is normally represented by a sparse structure-from-motion point cloud whose points carry the local features to be matched. This representation is tied to a specific local feature type, and switching to a different feature type requires an expensive feature-matching step to regenerate the 3D model. Moreover, state-of-the-art matching strategies are too resource-intensive for some real-time applications. Therefore, in this paper, we introduce a novel framework called deep-learning accelerated visual localization (DLALoc) based on mesh representation. In detail, we employ a dense 3D model, i.e., a mesh, to represent the scene, which provides more robust 2D-3D matches than 3D point clouds and database images; the corresponding 3D points are obtained from depth maps rendered from the mesh. Under this scene representation, we use a pretrained multilayer perceptron combined with homotopy continuation to compute the relative pose between the query and database images. We also exploit the scale consistency of the 2D-3D matches to perform an efficient random sample consensus (RANSAC) that finds the best 2D inlier set for the subsequent perspective-n-point (PnP) localization step. Furthermore, we evaluate the proposed visual localization pipeline on the Aachen Day-Night v1.1 and RobotCar Seasons datasets. The results show that the proposed approach achieves state-of-the-art accuracy while shortening localization time by about a factor of five.
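To make the mesh-based 2D-3D matching and PnP step described above concrete, the following is a minimal Python sketch: database-image keypoints are lifted to 3D through a depth map rendered from the mesh at the known database camera pose, and the query pose is then estimated with RANSAC-based PnP via OpenCV's solvePnPRansac. This is an illustration under assumed pinhole intrinsics and a known database camera-to-world pose, not the authors' implementation; the MLP/homotopy-continuation relative-pose estimation and the scale-consistency-guided RANSAC from the paper are omitted.

```python
import numpy as np
import cv2


def lift_to_3d(keypoints_2d, depth_map, K, cam_to_world):
    """Back-project database-image keypoints to 3D world points using a
    depth map rendered from the mesh at the known database camera pose."""
    u = keypoints_2d[:, 0]
    v = keypoints_2d[:, 1]
    z = depth_map[v.astype(int), u.astype(int)]
    valid = z > 0  # keep only pixels actually covered by the mesh
    # pinhole back-projection into the database camera frame
    x = (u[valid] - K[0, 2]) * z[valid] / K[0, 0]
    y = (v[valid] - K[1, 2]) * z[valid] / K[1, 1]
    pts_cam = np.stack([x, y, z[valid]], axis=1)
    # transform from the database camera frame to world coordinates
    R, t = cam_to_world[:3, :3], cam_to_world[:3, 3]
    return pts_cam @ R.T + t, valid


def localize_query(query_kp_2d, db_kp_2d, depth_map, K_db, db_cam_to_world, K_query):
    """Estimate the query camera pose from 2D-3D matches with RANSAC + PnP."""
    pts_3d, valid = lift_to_3d(db_kp_2d, depth_map, K_db, db_cam_to_world)
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        pts_3d.astype(np.float32),
        query_kp_2d[valid].astype(np.float32),
        K_query, None,
        iterationsCount=1000, reprojectionError=8.0)
    # rvec/tvec map world points into the query camera frame
    return ok, rvec, tvec, inliers
```

In the paper's pipeline, the 2D-2D matches between query and database images would come from the learned matching stage; here they are simply assumed as input arrays of pixel coordinates.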
Funder
National Natural Science Foundation of China
Subject
Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science