Affiliation:
1. Education Center of Experiments and Innovations, Harbin Institute of Technology, Shenzhen, 518055, P. R. China
2. School of Computing Sciences and Technology, Institute of Technology, Shenzhen, 518055, P. R. China
Abstract
Multimodal hashing aims to efficiently integrate multi-source data into a unified discrete Hamming space, enabling fast similarity search with minimal query and storage overhead. Traditional multimodal hashing assumes that data from all sources are fully observed; this assumption often fails in real-world large-scale multimodal scenarios, degrading conventional methods. To address this limitation, our approach handles dual-stage data missing, i.e., data that may be absent during both training and retrieval. In this paper, we introduce a novel framework called Flexible Dual Multimodal Hashing (FDMH), which recovers missing data at both stages by jointly exploiting low-dimensional data relations and semantic graph structures across multi-source data, achieving promising performance in incomplete multimodal retrieval. We transform the original features into anchor graphs and use the available modalities to reconstruct the anchor graphs of missing modalities. Based on these anchor graphs, we perform weight-adaptive fusion in the semantic space, supervised by the original semantic labels, and apply a tensor nuclear norm to enforce consistency constraints on the projection matrices across modalities. Furthermore, our method flexibly fuses the available and recovered modalities during retrieval. We validate the effectiveness of our approach through extensive experiments on four large-scale multimodal datasets, demonstrating robust performance in real-world dual-missing retrieval scenarios.
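To make the anchor-graph idea concrete, the sketch below builds a row-normalized Gaussian anchor graph per modality and fills in the anchor-graph rows of a missing modality from an available one via a least-squares linear map. This is a toy illustration under assumed details (Gaussian kernel, random anchors, linear reconstruction), not the paper's actual FDMH formulation, which additionally uses weight-adaptive semantic fusion and a tensor nuclear norm constraint.

```python
import numpy as np

def anchor_graph(X, anchors, sigma=1.0):
    """Row-normalized Gaussian similarity between samples X (n x d) and
    m anchors (m x d). Illustrative; the paper's construction may differ."""
    d2 = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(-1)  # (n, m) squared distances
    S = np.exp(-d2 / (2.0 * sigma ** 2))
    return S / S.sum(axis=1, keepdims=True)  # each row sums to 1

rng = np.random.default_rng(0)
n, d1, d2_, m = 200, 16, 24, 10
X1 = rng.normal(size=(n, d1))                # modality 1 (e.g. image), fully observed
X2 = X1 @ rng.normal(size=(d1, d2_)) * 0.5   # modality 2 (e.g. text), correlated toy data
A1 = X1[rng.choice(n, m, replace=False)]     # anchors sampled per modality
A2 = X2[rng.choice(n, m, replace=False)]
S1 = anchor_graph(X1, A1)
S2 = anchor_graph(X2, A2)

observed = np.ones(n, dtype=bool)
observed[150:] = False                       # last 50 samples lack modality 2

# Learn a linear map W so that S1 @ W approximates S2 on samples where both
# modalities exist, then fill in the missing rows of S2 -- a simple stand-in
# for the learned reconstruction in FDMH.
W, *_ = np.linalg.lstsq(S1[observed], S2[observed], rcond=None)
S2_full = S2.copy()
S2_full[~observed] = S1[~observed] @ W
```

In this toy setting the available modality's anchor graph carries enough structure to estimate the missing one; the actual method couples this recovery with semantic supervision rather than a plain least-squares fit.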
Publisher
World Scientific Pub Co Pte Ltd