Affiliation:
1. School of Cyberspace Security, Beijing University of Posts and Telecommunications, Beijing 100876, China
2. Intelligent Policing Key Laboratory of Sichuan Province, Sichuan Police College, Luzhou 646000, China
Abstract
This paper introduces a new deep clustering model, Multi-head Cross-Attention Contrastive Clustering (Multi-CC), which improves on the existing deep clustering model CC. The approach first augments the data to form image pairs and then uses a shared backbone to extract feature representations of these pairs. Contrastive learning is then performed separately in the row space and the column space of the feature matrix to jointly learn the instance and cluster representations. Multi-CC makes three key changes to CC. First, it uses a mixed strategy of strong and weak augmentation to construct image pairs. Second, it removes the pooling layer of the backbone to prevent loss of information. Third, it adds a multi-head cross-attention module to improve the model's performance. Together, these improvements reduce the model training time by 80%. Used as a baseline, Multi-CC achieves the best results on CIFAR-10, ImageNet-10, and ImageNet-dogs. It can easily be substituted for CC, allowing models built on CC to achieve better performance.
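The row-space and column-space contrastive step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes an InfoNCE-style loss with cosine similarity, leaves out the backbone, the augmentations, and the multi-head cross-attention module, and uses hypothetical function names (`nt_xent`, `row_and_column_loss`) and placeholder temperature values.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def nt_xent(view_a, view_b, temperature):
    """InfoNCE-style loss in which view_a[i] and view_b[i] form a positive pair.

    Every other vector from either view acts as a negative for the anchor.
    """
    vectors = view_a + view_b
    n = len(view_a)
    total = 0.0
    for i, anchor in enumerate(vectors):
        positive = (i + n) % (2 * n)  # index of the matching vector in the other view
        logits = [cosine(anchor, other) / temperature
                  for j, other in enumerate(vectors) if j != i]
        positive_logit = cosine(anchor, vectors[positive]) / temperature
        # Numerically stable log-sum-exp over all non-anchor similarities.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - positive_logit
    return total / (2 * n)


def transpose(matrix):
    """Turn an n x k matrix (list of rows) into a list of its k columns."""
    return [list(column) for column in zip(*matrix)]


def row_and_column_loss(inst_a, inst_b, clus_a, clus_b,
                        tau_instance=0.5, tau_cluster=1.0):
    """Sum of a row-wise (instance) and a column-wise (cluster) contrastive loss.

    inst_a / inst_b: n x d instance features of the two augmented views.
    clus_a / clus_b: n x k soft cluster assignments of the two views.
    The temperatures are assumed placeholder values, not taken from the paper.
    """
    instance_loss = nt_xent(inst_a, inst_b, tau_instance)  # rows are instances
    cluster_loss = nt_xent(transpose(clus_a), transpose(clus_b), tau_cluster)  # columns are clusters
    return instance_loss + cluster_loss
```

In this sketch, rows of the instance features are contrasted across the two views, while columns of the soft-assignment matrix are contrasted as cluster representations. Any additional regularization the full model may use on the cluster assignments is omitted here.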
Funder
National Key R&D Program of China
Opening Project of Intelligent Policing Key Laboratory of Sichuan Province
National Natural Science Foundation of China
111 Project
Open Foundation of Guizhou Provincial Key Laboratory of Public Big Data
Subject
Electrical and Electronic Engineering, Computer Networks and Communications, Hardware and Architecture, Signal Processing, Control and Systems Engineering