Abstract
AbstractThe minimum sum-of-squares clustering (MSSC), or k-means type clustering, has been recently extended to exploit prior knowledge on the cardinality of each cluster. Such knowledge is used to increase performance as well as solution quality. In this paper, we propose a global optimization approach based on the branch-and-cut technique to solve the cardinality-constrained MSSC. For the lower bound routine, we use the semidefinite programming (SDP) relaxation recently proposed by Rujeerapaiboon et al. (SIAM J Optim 29(2):1211–1239, 2019). However, this relaxation can be used in a branch-and-cut method only for small-size instances. Therefore, we derive a new SDP relaxation that scales better with the instance size and the number of clusters. In both cases, we strengthen the bound by adding polyhedral cuts. Benefiting from a tailored branching strategy which enforces pairwise constraints, we reduce the complexity of the problems arising in the children nodes. For the upper bound, instead, we present a local search procedure that exploits the solution of the SDP relaxation solved at each node. Computational results show that the proposed algorithm globally solves, for the first time, real-world instances of size 10 times larger than those solved by state-of-the-art exact methods.
Funder
Università degli Studi di Roma La Sapienza
Publisher
Springer Science and Business Media LLC
Subject
General Mathematics,Software
Reference58 articles.
1. Rao, M.: Cluster analysis and mathematical programming. J. Am. Stat. Assoc. 66(335), 622–626 (1971)
2. Gan, G., Ma, C., Wu, J.: Data Clustering: Theory, Algorithms, and Applications, 2nd edn. Society for Industrial and Applied Mathematics, Philadelphia (2020)
3. Aloise, D., Deshpande, A., Hansen, P., Popat, P.: NP-hardness of Euclidean sum-of-squares clustering. Mach. Learn. 75, 245–248 (2009)
4. Rujeerapaiboon, N., Schindler, K., Kuhn, D., Wiesemann, W.: Size matters: Cardinality-constrained clustering and outlier detection via conic optimization. SIAM J. Optim. 29(2), 1211–1239 (2019)
5. Davidson, I., Basu, S.: A survey of clustering with instance level constraints. ACM Trans. Knowl. Discov. Data. 1(1–41), 2–42 (2007)
Cited by
1 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献