Unsupervised Hashing with Semantic Concept Mining

Published:2023-05-26 Issue:1 Volume:1 Page:1-19
ISSN:2836-6573
Container-title:Proceedings of the ACM on Management of Data
language:en
Short-container-title:Proc. ACM Manag. Data

Author:

Tu Rong-Cheng¹, Mao Xian-Ling², Lin Kevin Qinghong³, Cai Chengfei⁴, Qin Weize⁵, Wei Wei⁶, Wang Hongfa⁵, Huang Heyan²

Affiliation:

1. Beijing Institute of Technology, Beijing Engineering Research Center of High Volume Language Information Processing and Cloud Computing Applications, & Beijing Institute of Technology Southeast Academy of Information Technology, Beijing, China

2. Beijing Institute of Technology, Beijing, China

3. National University of Singapore, Singapore, Singapore

4. Zhejiang University, Zhejiang, China

5. Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China

6. Huazhong University of Science and Technology, Wuhan, China

Abstract

Recently, to improve unsupervised image retrieval performance, many unsupervised hashing methods have been proposed that design a semantic similarity matrix based on the similarities between image features extracted by a pre-trained CNN model. However, most of these methods tend to ignore the high-level abstract semantic concepts contained in images. Intuitively, concepts play an important role in calculating the similarity among images: in real-world scenarios, each image is associated with some concepts, and the similarity between two images will be larger if they share more concepts. Inspired by this intuition, in this work we propose a novel method, Unsupervised Hashing with Semantic Concept Mining (UHSCM), which leverages a vision-language pretraining (VLP) model to construct a high-quality similarity matrix. Specifically, a set of randomly chosen concepts is first collected. Then, by employing a VLP model with prompt engineering, which has shown strong power in visual representation learning, the set of concepts is denoised according to the training images. Next, UHSCM applies the VLP model with prompting again to mine the concept distribution of each image and constructs a high-quality semantic similarity matrix based on the mined concept distributions. Finally, with the semantic similarity matrix as guiding information, a novel hashing loss with a modified contrastive-loss-based regularization term is proposed to optimize the hashing network. Extensive experiments on three benchmark datasets show that the proposed method outperforms state-of-the-art baselines on the image retrieval task.
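The step from per-image concept distributions to a semantic similarity matrix can be illustrated with a minimal sketch. The abstract does not say how raw scores become distributions or which similarity measure is used, so the softmax normalization, the cosine similarity, the `temperature` parameter, and all function names below are assumptions made for illustration only. The score values are invented; in practice they would come from a VLP model scoring each image against prompted concept texts.

```python
import math

def softmax(scores, temperature=1.0):
    """Turn raw image-to-concept scores into a probability distribution (assumed normalization)."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp((s - peak) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def similarity_matrix(score_rows, temperature=1.0):
    """Build an n x n similarity matrix from per-image concept scores."""
    dists = [softmax(row, temperature) for row in score_rows]
    n = len(dists)
    return [[cosine(dists[i], dists[j]) for j in range(n)] for i in range(n)]

# Hypothetical scores for 3 images against 4 concepts (not real model output).
scores = [
    [5.0, 1.0, 0.5, 0.2],  # image 0: dominated by concept 0
    [4.5, 1.2, 0.4, 0.1],  # image 1: also dominated by concept 0
    [0.3, 0.2, 4.8, 1.0],  # image 2: dominated by concept 2
]
S = similarity_matrix(scores)
```

In this toy input, images 0 and 1 share their dominant concept, so `S[0][1]` comes out larger than `S[0][2]`, matching the intuition that images sharing more concepts should be more similar.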

Funder

National Natural Science Foundation of China

National Key R&D Plan

Publisher

Association for Computing Machinery (ACM)

Link

https://dl.acm.org/doi/pdf/10.1145/3588683

References: 65 articles.

1. Hassan Akbari, Liangzhe Yuan, Rui Qian, Wei-Hong Chuang, Shih-Fu Chang, Yin Cui, and Boqing Gong. 2021. VATT: Transformers for multimodal self-supervised learning from raw video, audio and text. arXiv preprint arXiv:2104.11178 (2021).

2. Deep Cauchy Hashing for Hamming Space Retrieval

3. HashNet: Deep Learning to Hash by Continuation

4. NUS-WIDE

5. Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2018. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018).

Cited by 2 articles.

1. Discrepancy and Structure-Based Contrast for Test-Time Adaptive Retrieval;IEEE Transactions on Multimedia;2024

2. Exploring Hierarchical Information in Hyperbolic Space for Self-Supervised Image Hashing;IEEE Transactions on Image Processing;2024