Authors:
Li Yeqing, Huang Junzhou, Liu Wei
Abstract
Over the past decades, Spectral Clustering (SC) has become one of the most effective clustering approaches. Although it is widely used, one significant drawback of SC is its high computational cost. Many efforts have been devoted to accelerating SC algorithms, and promising results have been achieved. However, most existing algorithms rely on the assumption that the data can be stored in computer memory. When the data cannot fit in memory, these algorithms suffer severe performance degradation. To overcome this issue, we propose a novel sequential SC algorithm for tackling large-scale clustering with limited computational resources, e.g., memory. We begin by investigating an effective way of approximating the graph affinity matrix by leveraging a bipartite graph. We then choose a graph construction and optimization strategy that avoids random access to the data. These efforts lead to an efficient SC algorithm whose memory usage is independent of the number of input data points. Extensive experiments carried out on large datasets demonstrate that the proposed sequential SC algorithm is up to a thousand times faster than state-of-the-art methods.
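The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea it names: approximating the affinity matrix through a point-to-anchor bipartite graph and touching the data only in sequential passes. It is a generic anchor-graph construction, not the authors' method. The function names (`anchor_weights`, `sequential_embedding`), the `batches` callable, the Gaussian kernel, and the parameters `n_neighbors` and `sigma` are all assumptions made for illustration.

```python
import numpy as np

def anchor_weights(batch, anchors, n_neighbors, sigma):
    """Row-normalized Gaussian weights from each point to its nearest anchors."""
    d2 = ((batch[:, None, :] - anchors[None, :, :]) ** 2).sum(axis=2)
    # Shifting by the row minimum avoids underflow; it cancels after normalization.
    w = np.exp(-(d2 - d2.min(axis=1, keepdims=True)) / (2.0 * sigma ** 2))
    # Zero out every anchor except the n_neighbors closest ones per point.
    far = np.argsort(d2, axis=1)[:, n_neighbors:]
    np.put_along_axis(w, far, 0.0, axis=1)
    return w / w.sum(axis=1, keepdims=True)

def sequential_embedding(batches, anchors, n_components, n_neighbors=3, sigma=1.0):
    """Yield spectral embeddings batch by batch (hypothetical helper).

    `batches` is a zero-argument callable returning a fresh iterator over
    data batches, e.g. chunks read in order from disk. With Z the n x m
    point-to-anchor matrix, the affinity is approximated as
    W = Z diag(Z^T 1)^{-1} Z^T, and only m x m quantities are kept in memory.
    """
    m = anchors.shape[0]
    gram = np.zeros((m, m))   # accumulates Z^T Z
    degree = np.zeros(m)      # accumulates the column sums of Z

    # Pass 1: stream over the data once to build the small m x m problem.
    for batch in batches():
        z = anchor_weights(batch, anchors, n_neighbors, sigma)
        gram += z.T @ z
        degree += z.sum(axis=0)

    scale = 1.0 / np.sqrt(np.maximum(degree, 1e-12))
    small = scale[:, None] * gram * scale[None, :]
    vals, vecs = np.linalg.eigh(small)          # eigenvalues in ascending order
    top = np.argsort(vals)[::-1][:n_components]
    proj = scale[:, None] * vecs[:, top] / np.sqrt(np.maximum(vals[top], 1e-12))

    # Pass 2: stream again and map each batch to its embedding.
    for batch in batches():
        yield anchor_weights(batch, anchors, n_neighbors, sigma) @ proj
```

In this sketch the working memory scales with the number of anchors and the batch size rather than with the number of data points. The yielded embeddings would then be grouped by a clustering step such as k-means, which is left out here.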
Publisher
Association for the Advancement of Artificial Intelligence (AAAI)
Cited by
5 articles.
1. Learning Structure Aware Deep Spectral Embedding;IEEE Transactions on Image Processing;2023
2. Centerless Clustering;IEEE Transactions on Pattern Analysis and Machine Intelligence;2023-01-01
3. Efficient and Robust MultiView Clustering With Anchor Graph Regularization;IEEE Transactions on Circuits and Systems for Video Technology;2022-09
4. Improving Spectral Clustering Using Spectrum-Preserving Node Aggregation;2022 26th International Conference on Pattern Recognition (ICPR);2022-08-21
5. Robust landmark graph-based clustering for high-dimensional data;Neurocomputing;2022-07