Mining on Heterogeneous Manifolds for Zero-Shot Cross-Modal Image Retrieval

Published:2020-04-03 Issue:07 Volume:34 Page:12589-12596
ISSN:2374-3468
Container-title:Proceedings of the AAAI Conference on Artificial Intelligence
Short-container-title:AAAI

Author:

Yang Fan,Wang Zheng,Xiao Jing,Satoh Shin'ichi

Abstract

Most recent approaches to zero-shot cross-modal image retrieval use a pre-trained model to map images from different modalities into a uniform feature space and exploit their relevance there. Based on the observation that manifolds of zero-shot images are usually deformed and incomplete, we argue that the manifolds of unseen classes are inevitably distorted during the training of a two-stream model that simply maps images from different modalities into a uniform space. This issue directly leads to poor cross-modal retrieval performance. We propose a bi-directional random walk scheme to mine more reliable relationships between images by traversing heterogeneous manifolds in the feature space of each modality. Our proposed method benefits from intra-modal distributions to alleviate the interference caused by noisy similarities in the cross-modal feature space. As a result, we achieve a substantial improvement in performance on the thermal vs. visible image retrieval task. Code for this paper: https://github.com/fyang93/cross-modal-retrieval
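The idea of refining noisy cross-modal similarities with intra-modal structure can be illustrated with a small sketch. This is not the authors' implementation (see the repository linked in the abstract for that); it is a generic graph-propagation example under stated assumptions. The function names (`knn_transition`, `refine_cross_modal`), the parameters `k`, `alpha`, and `steps`, the cosine k-nearest-neighbour graphs, and the restart-style update rule are all hypothetical choices made for illustration.

```python
import numpy as np


def l2_normalize(x):
    """Scale each row to unit Euclidean length."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def knn_transition(feats, k):
    """Row-stochastic transition matrix of a k-NN cosine graph (assumed design).

    Requires k < number of rows in feats.
    """
    feats = l2_normalize(feats)
    sim = feats @ feats.T
    np.fill_diagonal(sim, -np.inf)            # never pick a point as its own neighbour
    n = sim.shape[0]
    idx = np.argsort(-sim, axis=1)[:, :k]     # k most similar items per row
    rows = np.arange(n)[:, None]
    w = np.zeros((n, n))
    # Keep non-negative edge weights; the small constant avoids empty rows.
    w[rows, idx] = np.clip(sim[rows, idx], 0.0, None) + 1e-12
    w = (w + w.T) / 2.0                       # symmetrize the graph
    return w / w.sum(axis=1, keepdims=True)   # normalize rows into probabilities


def refine_cross_modal(query, gallery, k=3, alpha=0.8, steps=20):
    """Propagate cross-modal similarities along both intra-modal graphs.

    query:   (n_q, d) features of one modality (e.g. thermal)
    gallery: (n_g, d) features of the other modality (e.g. visible)
    Update (an assumption, not the paper's rule):
        S <- alpha * P_q @ S @ P_g.T + (1 - alpha) * S0
    """
    s0 = l2_normalize(query) @ l2_normalize(gallery).T  # raw cross-modal cosine scores
    p_q = knn_transition(query, k)     # walk over the query-modality graph
    p_g = knn_transition(gallery, k)   # walk over the gallery-modality graph
    s = s0.copy()
    for _ in range(steps):
        s = alpha * (p_q @ s @ p_g.T) + (1.0 - alpha) * s0
    return s


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    thermal = rng.normal(size=(6, 8))    # synthetic stand-ins for real features
    visible = rng.normal(size=(10, 8))
    scores = refine_cross_modal(thermal, visible)
    ranking = np.argsort(-scores, axis=1)  # gallery indices, best match first
    print(ranking.shape)
```

Setting `alpha=0.0` returns the raw cosine scores unchanged, which gives a simple baseline for checking whether propagation helps on a given dataset.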

Publisher

Association for the Advancement of Artificial Intelligence (AAAI)

Subject

General Medicine

Cited by 15 articles.

1. Cross-Modality Transformer with Mixed Data Augmentation Learning for Visible-Infrared Person Re-Identification;2024 9th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA);2024-04-25

2. Learning Cross-View Visual Geo-Localization Without Ground Truth;IEEE Transactions on Geoscience and Remote Sensing;2024

3. Striking a Balance: Unsupervised Cross-Domain Crowd Counting via Knowledge Diffusion;Proceedings of the 31st ACM International Conference on Multimedia;2023-10-26

4. Parameter sharing and multi-granularity feature learning for cross-modality person re-identification;Complex & Intelligent Systems;2023-08-14

5. Diverse Embedding Expansion Network and Low-Light Cross-Modality Benchmark for Visible-Infrared Person Re-identification;2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR);2023-06