Contextual Patch-NetVLAD: Context-Aware Patch Feature Descriptor and Patch Matching Mechanism for Visual Place Recognition
Authors:
Sun Wenyuan (1), Chen Wentang (2,3,4), Huang Runxiang (1), Tian Jing (1)
Affiliation:
1. Institute of Systems Science, National University of Singapore, Singapore 119615, Singapore
2. State Key Laboratory of Fluid Power and Mechatronic Systems, School of Mechanical Engineering, Zhejiang University, Hangzhou 310027, China
3. Engineering Research Center for Design Engineering and Digital Twin of Zhejiang Province, School of Mechanical Engineering, Zhejiang University, Hangzhou 310027, China
4. Robotics Institute, Zhejiang University, Hangzhou 310027, China
Abstract
The goal of visual place recognition (VPR) is to determine the location of a query image by matching it against a database of reference images. Visual sensor technologies are crucial for VPR, as they enable precise identification and localization of query images within a database. Global-descriptor-based VPR methods struggle to accurately capture locally distinctive regions within a scene, which increases the probability of confusion during localization. To tackle the feature extraction and feature matching challenges in VPR, we propose a modified Patch-NetVLAD strategy that introduces two new modules: a context-aware patch descriptor and a context-aware patch matching mechanism. First, we propose a context-driven patch feature descriptor that overcomes the limitations of global and local descriptors by aggregating features from each patch's surrounding neighborhood. Second, we introduce a context-driven feature matching mechanism that uses cluster- and saliency-based weighting rules to assign higher weights to patches that are less similar to densely populated or locally similar regions, thereby improving localization performance. We incorporate both modules into the Patch-NetVLAD framework, yielding a new approach called Contextual Patch-NetVLAD. Experimental results show that the proposed approach outperforms other state-of-the-art methods, achieving Recall@10 scores of 99.82 on Pittsburgh30k, 99.82 on FMDataset, and 97.68 on our benchmark dataset.
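The neighborhood-aggregation idea behind the context-aware patch descriptor can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's exact formulation: the function name, the choice of mean pooling over a square window, and the L2 normalization are all assumptions made for the sketch.

```python
import numpy as np

def contextual_patch_descriptors(patch_feats, grid_h, grid_w, radius=1):
    """Illustrative sketch: enrich each patch descriptor with context from
    its spatial neighborhood on the patch grid.

    patch_feats : (grid_h * grid_w, D) array of per-patch descriptors,
                  laid out in row-major grid order.
    radius      : neighborhood half-width (radius=1 -> up to 3x3 window).
    """
    feats = patch_feats.reshape(grid_h, grid_w, -1)
    out = np.empty_like(feats)
    for i in range(grid_h):
        for j in range(grid_w):
            # Clamp the window at the grid borders.
            i0, i1 = max(0, i - radius), min(grid_h, i + radius + 1)
            j0, j1 = max(0, j - radius), min(grid_w, j + radius + 1)
            # Mean-pool the descriptors of the patch and its neighbors
            # (the pooling operator is an assumption of this sketch).
            out[i, j] = feats[i0:i1, j0:j1].mean(axis=(0, 1))
    # L2-normalize so descriptors are comparable under cosine similarity.
    out = out.reshape(grid_h * grid_w, -1)
    return out / np.linalg.norm(out, axis=1, keepdims=True)
```

A descriptor produced this way still has the original dimensionality, but nearby patches now share context, which is the property the abstract attributes to the proposed module.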
Subject
Electrical and Electronic Engineering, Biochemistry, Instrumentation, Atomic and Molecular Physics and Optics, Analytical Chemistry