MambaReID: Exploiting Vision Mamba for Multi-Modal Object Re-Identification

Published: 2024-07-17 Issue: 14 Volume: 24 Page: 4639
ISSN: 1424-8220
Container-title: Sensors
Language: en

Author:

Zhang Ruijuan¹,², Xu Lizhong¹, Yang Song¹, Wang Li³

Affiliation:

1. School of Computer and Information, Hohai University, Nanjing 211106, China

2. School of Mathematics and Statistics, Huaiyin Normal University, Huai’an 223300, China

3. School of Computer and Software, Nanjing Vocational University of Industry Technology, Nanjing 210023, China

Abstract

Multi-modal object re-identification (ReID) is a challenging task that seeks to identify objects across different image modalities by leveraging their complementary information. Traditional CNN-based methods are constrained by limited receptive fields, whereas Transformer-based approaches are hindered by high computational demands and a lack of convolutional inductive biases. To overcome these limitations, we propose a novel fusion framework named MambaReID, which builds on VMamba to integrate the strengths of both architectures. Specifically, MambaReID consists of three components: Three-Stage VMamba (TSV), Dense Mamba (DM), and Consistent VMamba Fusion (CVF). TSV efficiently captures global context information and local details with low computational complexity. DM enhances feature discriminability by fully integrating inter-modality information with shallow and deep features through dense connections. Additionally, with well-aligned multi-modal images, CVF provides more granular modal aggregation, thereby improving feature robustness. With these components, MambaReID achieves superior performance in multi-modal object ReID tasks while using fewer parameters and lower computational costs. Extensive experiments on three multi-modal object ReID benchmarks validate the effectiveness of MambaReID.

Funder

National Science Foundation of Jiang Su Higher Education Institutions

Publisher

MDPI AG

Link

https://www.mdpi.com/1424-8220/24/14/4639/pdf

References (48 articles)

1. Ye et al. (2021). Deep learning for person re-identification: A survey and outlook. TPAMI.

2. Ye, M., Chen, S., Li, C., Zheng, W., Crandall, D., and Du, B. (2024). Transformer for Object Re-Identification: A Survey. arXiv.

3. Amiri, A., Kaya, A., and Keceli, A. (2024). A Comprehensive Survey on Deep-Learning-based Vehicle Re-Identification: Models, Data Sets and Challenges. arXiv.

4. Zheng et al. (2021). Robust multi-modality person re-identification. Proc. AAAI Conf. Artif. Intell.

5. Li et al. (2020). Multi-spectral vehicle re-identification: A challenge. Proc. AAAI Conf. Artif. Intell.