Compressed K-Means for Large-Scale Clustering

Published:2017-02-13 Issue:1 Volume:31 Page:
ISSN:2374-3468
Container-title:Proceedings of the AAAI Conference on Artificial Intelligence
language:
Short-container-title:AAAI

Author:

Shen Xiaobo, Liu Weiwei, Tsang Ivor, Shen Fumin, Sun Quan-Sen

Abstract

Large-scale clustering has been widely used in many applications and has received much attention. Most existing clustering methods suffer from high computation and memory costs when applied to large-scale datasets. In this paper, we propose a novel clustering method, dubbed compressed k-means (CKM), for fast large-scale clustering. Specifically, high-dimensional data are compressed into short binary codes, which are well suited for fast clustering. CKM enjoys two key benefits: 1) storage can be significantly reduced by representing data points as binary codes; 2) distance computation is very efficient using the Hamming metric between binary codes. We propose to jointly learn binary codes and clusters within one framework. Extensive experimental results on four large-scale datasets, including two million-scale datasets, demonstrate that CKM outperforms the state-of-the-art large-scale clustering methods in terms of both computation and memory cost, while achieving comparable clustering accuracy.
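
The two benefits named in the abstract (compact storage and Hamming-distance computation) can be illustrated with a minimal sketch. This is not the authors' CKM algorithm: CKM learns codes and clusters jointly, whereas the sketch below assumes a simple stand-in, random sign projections, to produce the codes and then runs a plain k-means loop in Hamming space. The function names `to_binary_codes`, `hamming_distances`, and `binary_kmeans` are hypothetical and chosen only for illustration.

```python
import numpy as np


def to_binary_codes(X, n_bits, rng):
    """ASSUMPTION: random sign projections stand in for learned codes."""
    W = rng.standard_normal((X.shape[1], n_bits))
    return (X @ W > 0).astype(np.uint8)  # shape (n, n_bits), entries 0 or 1


def hamming_distances(codes, centers):
    """Pairwise Hamming distances between rows of two 0/1 arrays."""
    # XOR marks the bits that differ; summing over the bit axis counts them.
    return (codes[:, None, :] ^ centers[None, :, :]).sum(axis=2)


def binary_kmeans(codes, k, n_iter=10, seed=0):
    """Plain k-means in Hamming space (illustrative, not the paper's CKM)."""
    rng = np.random.default_rng(seed)
    # Initialize centers with k distinct data points.
    centers = codes[rng.choice(len(codes), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = hamming_distances(codes, centers).argmin(axis=1)
        for j in range(k):
            members = codes[labels == j]
            if len(members):
                # A per-bit majority vote minimizes the total Hamming
                # distance from the members to a binary center.
                centers[j] = (members.mean(axis=0) >= 0.5).astype(np.uint8)
    labels = hamming_distances(codes, centers).argmin(axis=1)
    return labels, centers


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((1000, 128)).astype(np.float32)
    codes = to_binary_codes(X, n_bits=32, rng=rng)
    labels, centers = binary_kmeans(codes, k=5)
    packed = np.packbits(codes, axis=1)  # 8 bits per byte
    print("bytes per point, float32:", X[0].nbytes)
    print("bytes per point, packed code:", packed[0].nbytes)
```

With 128 float32 features and 32-bit codes, each point shrinks from 512 bytes to 4 bytes once the bits are packed with `np.packbits`. The sketch keeps codes unpacked during clustering for readability; a production implementation would more likely operate on packed words with XOR and population count.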

Publisher

Association for the Advancement of Artificial Intelligence (AAAI)

Subject

General Medicine

Cited by 12 articles.

1. NetHD: Neurally Inspired Integration of Communication and Learning in Hyperspace;Advanced Intelligent Systems;2024-05-26

2. Large-Scale Clustering on 100 M-Scale Datasets Using a Single T4 GPU via Recall KNN and Subgraph Segmentation;Neural Processing Letters;2024-02-15

3. Multi-view clustering via optimal transport algorithm;Knowledge-Based Systems;2023-11

4. Deep Fair Clustering via Maximizing and Minimizing Mutual Information: Theory, Algorithm and Metric;2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR);2023-06

5. An Effective and Efficient Algorithm for K-Means Clustering With New Formulation;IEEE Transactions on Knowledge and Data Engineering;2023-04-01