Author:
Labiod Lazhar, Nadif Mohamed
Abstract
The k-means algorithm and several of its variants have been shown to be useful and effective for clustering. In this paper we embed k-means variants in a bi-stochastic matrix approximation (BMA) framework and derive a new formulation of the criterion from the k-means objective function. In particular, we show that some k-means variants are equivalent to an algebraic bi-stochastic matrix approximation problem under suitable constraints. To optimize the derived objective function, we develop two algorithms: the first learns a bi-stochastic similarity matrix, while the second seeks the optimal partition, which is the equilibrium state of a Markov chain process. Numerical experiments on real datasets demonstrate the interest of our approach.
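To make the term "bi-stochastic" concrete, here is a minimal Python sketch that scales a positive similarity matrix until every row and every column sums to one. It uses the standard Sinkhorn-Knopp alternating normalization, which is not necessarily the algorithm developed in the paper. The function name `sinkhorn_knopp`, the toy data, and the Gaussian similarity are assumptions made for illustration only.

```python
import numpy as np

def sinkhorn_knopp(S, n_iter=1000, tol=1e-9):
    """Scale a strictly positive square matrix toward a bi-stochastic one.

    Illustrative helper (hypothetical name), not the paper's method.
    """
    B = np.asarray(S, dtype=float).copy()
    for _ in range(n_iter):
        B /= B.sum(axis=1, keepdims=True)  # make each row sum to 1
        B /= B.sum(axis=0, keepdims=True)  # make each column sum to 1
        # Stop once the rows are also (almost) normalized.
        if np.abs(B.sum(axis=1) - 1.0).max() < tol:
            break
    return B

# Toy data, made up for illustration: two loose groups of 2-D points.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (5, 2)), rng.normal(1.0, 0.3, (5, 2))])

# Gaussian similarity between all pairs of points (strictly positive).
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
S = np.exp(-sq_dists)

B = sinkhorn_knopp(S)
print(B.sum(axis=0))  # column sums
print(B.sum(axis=1))  # row sums
```

A matrix scaled this way can be read as the transition matrix of a Markov chain on the data points, which is the kind of object the abstract refers to when it mentions an equilibrium state.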
Publisher
Springer International Publishing