Abstract
This work studies the problem of clustering one-dimensional data points so that they are evenly distributed over a given number of low-variance clusters. One application is the visualization of data on choropleth maps or on business process models without over-emphasizing outliers, which enables the detection and differentiation of smaller clusters. The problem is tackled with a heuristic algorithm called DDCAL (1d distribution cluster algorithm), which is based on iterative feature scaling and generates stable clusters. The effectiveness of DDCAL is shown on 5 artificial data sets with different distributions and 4 real-world data sets reflecting different use cases. On these data sets, the results from DDCAL are also compared with those of 11 existing clustering algorithms. The application of DDCAL is illustrated through the visualization of pandemic and population data on choropleth maps as well as process mining results on process models.
Funder
Technische Universität München
Publisher
Springer Science and Business Media LLC
Subject
Library and Information Sciences; Statistics, Probability and Uncertainty; Psychology (miscellaneous); Mathematics (miscellaneous)
Cited by
2 articles.