Affiliation:
1. Shaanxi Normal University
Abstract
Cross-modal information retrieval has attracted much attention from academics and practitioners. One key challenge of cross-modal retrieval is bridging the heterogeneity gap between different modalities. Most existing methods jointly construct a common subspace, but very little attention has been given to the importance of different fine-grained regions across modalities. This omission significantly limits how well the information extracted from each modality is utilized. Therefore, this study proposes a novel text-image cross-modal retrieval approach built on a dual attention network and an enhanced relation network (DAER). More specifically, the dual attention network precisely extracts fine-grained weight information from text and images, while the enhanced relation network enlarges the differences between categories of data to improve the accuracy of similarity computation. Comprehensive experimental results on three widely used datasets (i.e., Wikipedia, Pascal Sentence, and XMediaNet) show that our proposed approach is effective and superior to existing cross-modal retrieval methods.
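The abstract describes two components: attention that weights fine-grained regions of each modality, and a relation module that scores cross-modal similarity. The abstract does not give the actual DAER architecture, so the following is only a minimal, hypothetical sketch of the general pattern (softmax attention pooling over region features, followed by a similarity score); the function names and the use of cosine similarity in place of a learned relation module are assumptions, not the paper's method.

```python
import math

def softmax(scores):
    # numerically stable softmax: attention weights sum to 1
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(region_feats, query):
    """Pool a set of region feature vectors into one vector,
    weighting each region by its relevance to the query."""
    # relevance score of each region = dot product with the query
    scores = [sum(r * q for r, q in zip(feat, query)) for feat in region_feats]
    weights = softmax(scores)
    dim = len(region_feats[0])
    # weighted sum of region features -> attended representation
    return [sum(w * feat[d] for w, feat in zip(weights, region_feats))
            for d in range(dim)]

def relation_score(img_feat, txt_feat):
    # a learned relation network would map the feature pair to a score;
    # cosine similarity stands in here purely for illustration
    dot = sum(a * b for a, b in zip(img_feat, txt_feat))
    na = math.sqrt(sum(a * a for a in img_feat))
    nb = math.sqrt(sum(b * b for b in txt_feat))
    return dot / (na * nb) if na and nb else 0.0
```

In a trained system the query would come from the other modality (e.g., a sentence embedding attending over image regions), and retrieval would rank candidates by the relation score.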
Publisher
Research Square Platform LLC