Author:
Fränti Pasi, Sieranoja Sami
Abstract
<abstract>
<p>Clustering accuracy (ACC) is one of the most frequently used measures in the literature for evaluating clustering quality. However, it is often used without a definition or a reference to one. In this paper, we identify the origin of the measure, give a proper definition for it, and provide a simple bug fix that allows it to be used even when the number of clusters does not match. We show that the measure belongs to a wider class of set-matching based measures, and we compare its properties to those of the centroid index (CI) and normalized mutual information (NMI).</p>
</abstract>
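As an illustration of what a set-matching accuracy measure computes, the sketch below uses the commonly cited form of ACC: the fraction of points whose cluster label agrees with the ground-truth class under the best one-to-one matching of clusters to classes. It is not taken from the paper. The function name `clustering_accuracy` is hypothetical, the brute-force search is only practical for a handful of clusters, and the way it handles a mismatch in the number of clusters (matching the smaller label set into the larger one and counting unmatched groups as errors) is an assumption that may differ from the fix the authors propose.

```python
from collections import Counter
from itertools import permutations


def clustering_accuracy(true_labels, pred_labels):
    """Fraction of points that agree under the best one-to-one matching
    between predicted clusters and ground-truth classes (illustrative)."""
    if len(true_labels) != len(pred_labels):
        raise ValueError("label sequences must have the same length")
    if len(true_labels) == 0:
        raise ValueError("label sequences must be non-empty")

    classes = list(set(true_labels))
    clusters = list(set(pred_labels))
    # Contingency counts: number of points in each (cluster, class) pair.
    counts = Counter(zip(pred_labels, true_labels))

    # Assumption: when the numbers differ, match the smaller label set
    # into the larger one; unmatched groups contribute no correct points.
    if len(clusters) <= len(classes):
        small, large = clusters, classes
        hits = lambda s, l: counts[(s, l)]
    else:
        small, large = classes, clusters
        hits = lambda s, l: counts[(l, s)]

    # Exhaustive search over one-to-one matchings (small inputs only).
    best = 0
    for chosen in permutations(large, len(small)):
        matched = sum(hits(s, l) for s, l in zip(small, chosen))
        best = max(best, matched)
    return best / len(true_labels)
```

For realistic numbers of clusters, the exhaustive search would normally be replaced by an optimal assignment solver such as the Hungarian algorithm, which finds the same matching in polynomial time.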
Publisher
American Institute of Mathematical Sciences (AIMS)