Unsupervised Deep Relative Neighbor Relationship Preserving Cross-Modal Hashing

Published:2022-07-28 Issue:15 Volume:10 Page:2644
ISSN:2227-7390
Container-title:Mathematics
language:en
Short-container-title:Mathematics

Author:

Yang Xiaohan, Wang Zhen, Wu Nannan, Li Guokun, Feng Chuang, Liu Pingping

Abstract

The image-text cross-modal retrieval task, which aims to retrieve relevant images from text queries and vice versa, is attracting widespread attention. To respond quickly to large-scale retrieval tasks, we propose Unsupervised Deep Relative Neighbor Relationship Preserving Cross-Modal Hashing (DRNPH), which performs cross-modal retrieval in a common Hamming space and thereby gains advantages in storage and efficiency. To support nearest neighbor search in the Hamming space, we require that both the original intra- and inter-modal neighbor matrices be reconstructed from the binary feature vectors. Thus, the neighbor relationships among samples of different modalities can be computed directly from Hamming distances. Furthermore, the cross-modal pair-wise similarity preserving constraint requires that the samples in a similar pair have identical Hamming distances to the anchor. As a result, similar sample pairs share the same binary code and have minimal Hamming distances. Unfortunately, the pair-wise similarity preserving constraint may lead to an imbalanced code problem. Therefore, we propose the cross-modal triplet relative similarity preserving constraint, which requires the Hamming distances of similar pairs to be smaller than those of dissimilar pairs so that the ranking order of samples in the retrieval results can be distinguished. Moreover, a large similarity margin can improve the algorithm's robustness to noise. We conduct comparative cross-modal retrieval experiments and an ablation study on two public datasets, MIRFlickr and NUS-WIDE. The experimental results show that DRNPH outperforms state-of-the-art approaches in various image-text retrieval scenarios, and that all three proposed constraints are necessary and effective for boosting cross-modal retrieval performance.
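The triplet relative similarity idea in the abstract can be illustrated with a minimal sketch. This is not the loss used in the paper, whose exact formulation is not given here; it is a generic hinge over Hamming distances. The function names, the margin value, and the 8-bit example codes are all assumptions made for illustration.

```python
from typing import Sequence


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions where two equal-length binary codes differ."""
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def triplet_hinge(anchor: Sequence[int],
                  positive: Sequence[int],
                  negative: Sequence[int],
                  margin: int = 2) -> int:
    """Generic triplet hinge (an assumption, not the paper's loss).

    Returns 0 when the dissimilar sample is at least `margin` bits
    farther from the anchor than the similar sample; otherwise returns
    the size of the violation.
    """
    d_pos = hamming_distance(anchor, positive)
    d_neg = hamming_distance(anchor, negative)
    return max(0, margin + d_pos - d_neg)


# Hypothetical 8-bit codes: an image anchor, a related text, an unrelated text.
image_code = [1, 0, 1, 1, 0, 0, 1, 0]
similar_text = [1, 0, 1, 1, 0, 1, 1, 0]
unrelated_text = [0, 1, 1, 0, 0, 1, 0, 0]

satisfied = triplet_hinge(image_code, similar_text, unrelated_text)
violated = triplet_hinge(image_code, unrelated_text, similar_text)
```

Here `satisfied` is zero because the related text is already much closer to the image than the unrelated one, while swapping the two texts gives a positive penalty. A larger `margin` demands a wider gap between similar and dissimilar pairs, which matches the abstract's remark that a large margin helps with noise.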

Funder

National Natural Science Foundation of China

the Natural Science Foundation of Shandong Province of China

Publisher

MDPI AG

Subject

General Mathematics,Engineering (miscellaneous),Computer Science (miscellaneous)

Link

https://www.mdpi.com/2227-7390/10/15/2644/pdf

References (51 articles)

1. Deep Supervised Cross-Modal Retrieval;Zhen;Proceedings of the Computer Vision and Pattern Recognition,2019

2. Discriminative Supervised Hashing for Cross-Modal Similarity Search

3. Multimedia content processing through cross-modal association;Li;Proceedings of the International Conference on Multimedia,2003

4. A new approach to cross-modal multimedia retrieval;Rasiwasia;Proceedings of the International Conference on Multimedia, ACM,2010

5. Multi-view discriminant analysis;Kan;Proceedings of the European Conference on Computer Vision,2012

Cited by 1 article.

1. Multi-Grained Similarity Preserving and Updating for Unsupervised Cross-Modal Hashing;Applied Sciences;2024-01-19