HARR: Learning Discriminative and High-Quality Hash Codes for Image Retrieval
Published:2024-01-22
Issue:5
Volume:20
Page:1-23
ISSN:1551-6857
Container-title:ACM Transactions on Multimedia Computing, Communications, and Applications
language:en
Short-container-title:ACM Trans. Multimedia Comput. Commun. Appl.
Author:
Ma Zeyu (1),
Wang Siwei (2),
Luo Xiao (3),
Gu Zhonghui (4),
Chen Chong (5),
Li Jinxing (1),
Hua Xian-Sheng (5),
Lu Guangming (1)
Affiliation:
1. School of Computer Science and Technology, Harbin Institute of Technology, China
2. College of Finance and Statistics, Hunan University, China
3. Department of Computer Science, University of California, Los Angeles, USA
4. Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, China
5. Terminus Group, China
Abstract
This article studies deep unsupervised hashing, which has attracted increasing attention in large-scale image retrieval. Most recent approaches reconstruct semantic similarity information and use it to guide hash code learning. However, they still fail to achieve satisfactory performance in practice, for two reasons. First, without accurate supervised information, these methods usually fail to produce independent and robust hash codes that preserve semantic information well, which may hinder effective image retrieval. Second, because of the discrete constraints, effectively optimizing the hashing network in an end-to-end manner with small quantization errors remains an open problem. To address these difficulties, we propose a novel unsupervised hashing method called HARR to learn discriminative and high-quality hash codes. To comprehensively explore the semantic similarity structure, HARR adopts the Winner-Take-All hash to model it. Similarity-preserving hash codes are then learned under the guidance of the reconstructed similarity structure. Additionally, we improve the quality of the hash codes with a bit correlation reduction module, which forces the cross-correlation matrix between a batch of hash codes under different augmentations to approach the identity matrix. In this way, the generated hash bits are expected to be invariant to disturbances with minimal redundancy, which can be further interpreted as an instantiation of the information bottleneck principle. Finally, for effective hashing network training, we minimize the cosine distances between the real-valued network outputs and their binary codes to keep quantization errors small. Extensive experiments demonstrate the effectiveness of the proposed HARR.
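The three ingredients named in the abstract can be illustrated with a small standard-library Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the function names, the off-diagonal weight `lam`, the sign convention at zero, and the use of matching WTA codes as a similarity estimate are all assumptions made for illustration.

```python
import math


def wta_hash(x, perms, k):
    """Winner-Take-All hash: for each permutation, inspect the first k
    permuted entries of x and record the position of the largest one."""
    return [max(range(k), key=lambda j: x[p[j]]) for p in perms]


def wta_similarity(code_a, code_b):
    """Fraction of matching WTA codes (assumed here as a similarity estimate)."""
    return sum(a == b for a, b in zip(code_a, code_b)) / len(code_a)


def standardize_columns(z):
    """Give each column (hash bit) zero mean and unit variance over the batch."""
    n, d = len(z), len(z[0])
    out = [[0.0] * d for _ in range(n)]
    for j in range(d):
        col = [row[j] for row in z]
        mean = sum(col) / n
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / n) + 1e-8
        for i in range(n):
            out[i][j] = (z[i][j] - mean) / std
    return out


def bit_correlation_loss(z1, z2, lam=0.005):
    """Penalise the gap between the cross-correlation matrix of two
    augmented views and the identity. lam is an assumed weight for the
    off-diagonal (redundancy) terms."""
    a, b = standardize_columns(z1), standardize_columns(z2)
    n, d = len(a), len(a[0])
    loss = 0.0
    for i in range(d):
        for j in range(d):
            c = sum(a[t][i] * b[t][j] for t in range(n)) / n
            loss += (1.0 - c) ** 2 if i == j else lam * c ** 2
    return loss


def cosine_quantization_loss(z):
    """Mean cosine distance between each real-valued output and its
    binary code sign(z); zero is mapped to +1 by assumption."""
    total = 0.0
    for row in z:
        code = [1.0 if v >= 0 else -1.0 for v in row]
        dot = sum(v * c for v, c in zip(row, code))
        norm = math.sqrt(sum(v * v for v in row)) * math.sqrt(len(row))
        total += 1.0 - dot / (norm + 1e-8)
    return total / len(z)
```

In this sketch, `bit_correlation_loss` is near zero when the two views agree bit by bit and different bits are uncorrelated across the batch, and `cosine_quantization_loss` is near zero when every output already points in the direction of its binary code. The WTA codes depend only on the rank order of the features, so they are unchanged by any increasing transformation of the input.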
Funder
NSFC
Shenzhen Key Technical Project
Guangdong International Science and Technology Cooperation Project
Shenzhen Fundamental Research Fund
Guangdong Provincial Key Laboratory of Novel Security Intelligence Technologies
National Natural Science Foundation of China
Shenzhen Colleges and Universities Stable Support Program
Shenzhen Science and Technology Program
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Networks and Communications, Hardware and Architecture