Abstract
Various criteria and algorithms can be used for clustering, leading to very distinct outcomes and potential biases towards datasets with certain structures. More generally, the selection of the most effective algorithm to be applied for a given dataset, based on its characteristics, is a problem that has been largely studied in the field of meta-learning. Recent advances in the form of a new methodology known as Instance Space Analysis provide an opportunity to extend such meta-analyses to gain greater visual insights of the relationship between datasets’ characteristics and the performance of different algorithms. The aim of this study is to perform an Instance Space Analysis for the first time for clustering problems and algorithms. As a result, we are able to analyze the impact of the choice of the test instances employed, and the strengths and weaknesses of some popular clustering algorithms, for datasets with different structures.
Funder
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
Australian Research Council
Subject
Computational Mathematics,Computational Theory and Mathematics,Numerical Analysis,Theoretical Computer Science
Reference60 articles.
1. Mathematics of Data Science: A Computational Approach to Clustering and Classification;Calvetti,2020
2. Cross-disciplinary perspectives on meta-learning for algorithm selection
3. Meta-learning: A survey;Vanschoren;arXiv,2018
4. Instance spaces for machine learning classification
Cited by
7 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献