HARR: Learning Discriminative and High-Quality Hash Codes for Image Retrieval

Published: 2024-01-22 Issue: 5 Volume: 20 Page: 1-23
ISSN: 1551-6857
Container-title: ACM Transactions on Multimedia Computing, Communications, and Applications
Language: en
Short-container-title: ACM Trans. Multimedia Comput. Commun. Appl.

Author:

Ma Zeyu¹, Wang Siwei², Luo Xiao³, Gu Zhonghui⁴, Chen Chong⁵, Li Jinxing¹, Hua Xian-Sheng⁵, Lu Guangming¹

Affiliation:

1. School of Computer Science and Technology, Harbin Institute of Technology, China

2. College of Finance and Statistics, Hunan University, China

3. Department of Computer Science, University of California, Los Angeles, USA

4. Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, China

5. Terminus Group, China

Abstract

This article studies deep unsupervised hashing, which has attracted increasing attention in large-scale image retrieval. Most recent approaches reconstruct semantic similarity information, which then guides hash code learning. However, they still fail to achieve satisfactory performance in practice, for two reasons. First, without accurate supervised information, these methods usually fail to produce independent and robust hash codes that preserve semantic information well, which may hinder effective image retrieval. Second, because of the discrete constraints, how to optimize the hashing network effectively in an end-to-end manner with small quantization errors remains an open problem. To address these difficulties, we propose a novel unsupervised hashing method called HARR that learns discriminative and high-quality hash codes. To comprehensively explore the semantic similarity structure, HARR adopts the Winner-Take-All hash to model it. Similarity-preserving hash codes are then learned under the reliable guidance of the reconstructed similarity structure. Additionally, we improve the quality of the hash codes with a bit correlation reduction module, which forces the cross-correlation matrix between a batch of hash codes under different augmentations to approach the identity matrix. In this way, the generated hash bits are expected to be invariant to disturbances with minimal redundancy, which can be further interpreted as an instantiation of the information bottleneck principle. Finally, for effective hashing network training, we minimize the cosine distances between the real-valued network outputs and their binary codes to keep quantization errors small. Extensive experiments demonstrate the effectiveness of the proposed HARR.
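
The three ingredients named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the `off_diag_weight` parameter, the squared-error form of the correlation term, and the use of the WTA match rate as a similarity score are all assumptions made for illustration; the abstract only states that WTA hashing models the similarity structure, that the cross-correlation matrix is pushed toward the identity, and that cosine distances to the binary codes are minimized.

```python
import math

def wta_hash(features, permutations, window):
    """Winner-Take-All code: for each permutation, look at the first
    `window` permuted entries and record the position of the largest."""
    return [max(range(window), key=lambda k: features[perm[k]])
            for perm in permutations]

def wta_similarity(code_a, code_b):
    """Assumed similarity score: fraction of WTA positions that agree."""
    return sum(a == b for a, b in zip(code_a, code_b)) / len(code_a)

def cosine_quantization_loss(outputs, eps=1e-12):
    """Mean cosine distance between real-valued outputs z and sign(z)."""
    total = 0.0
    for z in outputs:
        b = [1.0 if v >= 0 else -1.0 for v in z]
        dot = sum(zi * bi for zi, bi in zip(z, b))
        norm_z = math.sqrt(sum(v * v for v in z))
        norm_b = math.sqrt(len(z))
        total += 1.0 - dot / (norm_z * norm_b + eps)
    return total / len(outputs)

def cross_correlation(view_a, view_b):
    """d x d cross-correlation between the bits of two augmented views,
    with every bit standardised over the batch dimension."""
    n, d = len(view_a), len(view_a[0])

    def standardise(x):
        columns = []
        for j in range(d):
            col = [row[j] for row in x]
            mean = sum(col) / n
            std = math.sqrt(sum((v - mean) ** 2 for v in col) / n) or 1.0
            columns.append([(v - mean) / std for v in col])
        return columns  # d columns, each of length n

    a, b = standardise(view_a), standardise(view_b)
    return [[sum(p * q for p, q in zip(a[i], b[j])) / n for j in range(d)]
            for i in range(d)]

def correlation_reduction_loss(view_a, view_b, off_diag_weight=0.01):
    """Assumed loss form: squared distance of the cross-correlation
    matrix from the identity, with down-weighted off-diagonal terms."""
    c = cross_correlation(view_a, view_b)
    d = len(c)
    on_diag = sum((1.0 - c[i][i]) ** 2 for i in range(d))
    off_diag = sum(c[i][j] ** 2 for i in range(d) for j in range(d) if i != j)
    return on_diag + off_diag_weight * off_diag
```

In this sketch the diagonal term rewards bits that agree across augmentations (invariance), while the off-diagonal term penalizes bits that carry the same information (redundancy). The quantization term is zero only when every output already points in the direction of its own sign vector.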

Funder

NSFC

Shenzhen Key Technical Project

Guangdong International Science and Technology Cooperation Project

Shenzhen Fundamental Research Fund

Guangdong Provincial Key Laboratory of Novel Security Intelligence Technologies

National Natural Science Foundation of China

Shenzhen Colleges and Universities Stable Support Program

Shenzhen Science and Technology Program

Publisher

Association for Computing Machinery (ACM)

Subject

Computer Networks and Communications, Hardware and Architecture

Link

https://dl.acm.org/doi/pdf/10.1145/3627162

References (82 articles)

1. Learning representations for neural network-based classification using the information bottleneck principle; Amjad Rana Ali; IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019

2. Loopy residual hashing: Filling the quantization gap for image retrieval; Bai Jiale; IEEE Transactions on Multimedia, 2019

3. Optimal rates of convergence for covariance matrix estimation; Cai T. Tony; Annals of Statistics, 2010

4. Zhangjie Cao, Mingsheng Long, Jianmin Wang, and Philip S. Yu. 2017. HashNet: Deep learning to hash by continuation. In Proceedings of the IEEE/CVF International Conference on Computer Vision.

5. Mathilde Caron, Ishan Misra, Julien Mairal, Priya Goyal, Piotr Bojanowski, and Armand Joulin. 2020. Unsupervised learning of visual features by contrasting cluster assignments. In Proceedings of the Conference on Neural Information Processing Systems.