Abstract
Visual SLAM systems require precise localization. To obtain consistent feature matching, visual features extracted by neural networks are increasingly used to replace traditional hand-crafted features in scenes with weak texture, motion blur, or repetitive patterns. However, most deep-learning-enhanced SLAM systems pay for this improved accuracy with decreased efficiency. In this paper, we propose Coarse TRVO, a visual odometry system that uses deep learning for feature matching. The network combines a CNN with transformer structures to provide dense, high-quality end-to-end matches for a pair of images, even in indistinctive settings where low-texture regions or repetitive patterns occupy most of the field of view. We also made the proposed model compatible with the NVIDIA TensorRT runtime to boost its performance. After the matching point pairs are obtained, the camera pose is solved by minimizing the re-projection error of the feature points. Experiments on multiple datasets and in real environments show that Coarse TRVO achieves higher robustness and relative positioning accuracy than current mainstream visual SLAM systems.
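The abstract's final pose-solving step, minimizing the re-projection error of matched feature points, can be sketched as a small Gauss-Newton refinement. This is a generic illustration, not the paper's actual implementation: the function names, the numerical Jacobian, and the axis-angle pose parameterization are all assumptions made for the sketch.

```python
import numpy as np

def rodrigues(rvec):
    """Axis-angle vector -> 3x3 rotation matrix (Rodrigues' formula)."""
    theta = np.linalg.norm(rvec)
    if theta < 1e-12:
        return np.eye(3)
    k = rvec / theta
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def reproject(params, pts3d, K):
    """Project 3D points with pose params = [rvec(3), t(3)] and intrinsics K."""
    R, t = rodrigues(params[:3]), params[3:]
    cam = pts3d @ R.T + t          # world -> camera frame
    proj = cam @ K.T               # pinhole projection (homogeneous)
    return proj[:, :2] / proj[:, 2:3]

def residuals(params, pts3d, pts2d, K):
    """Stacked re-projection errors (the quantity being minimized)."""
    return (reproject(params, pts3d, K) - pts2d).ravel()

def refine_pose(pts3d, pts2d, K, params0, iters=20):
    """Gauss-Newton on the re-projection error with a numerical Jacobian."""
    params = params0.astype(float).copy()
    for _ in range(iters):
        r = residuals(params, pts3d, pts2d, K)
        J = np.zeros((r.size, 6))
        eps = 1e-6
        for i in range(6):          # finite-difference Jacobian, column by column
            dp = np.zeros(6)
            dp[i] = eps
            J[:, i] = (residuals(params + dp, pts3d, pts2d, K) - r) / eps
        step = np.linalg.lstsq(J, -r, rcond=None)[0]
        params += step
        if np.linalg.norm(step) < 1e-10:
            break
    return params
```

In a real pipeline this refinement would run on the matches produced by the CNN/transformer front-end, typically inside a RANSAC loop (e.g. OpenCV's `solvePnPRansac`) and with an analytic Jacobian for speed.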
Funder
Natural Science Foundation of Beijing Municipality
National Science Foundation
National Key Research and Development Program of China
Aeronautical Science Foundation of China
Publisher
Fuji Technology Press Ltd.
Subject
Artificial Intelligence, Computer Vision and Pattern Recognition, Human-Computer Interaction
Cited by
3 articles.