Abstract
Hierarchical clustering of multivariate data usually provides useful information on the similarity among elements. Unfortunately, the clustering does not immediately suggest the structure governing the data. Moreover, the amount of information retrieved by clustering can sometimes be so large that the results become hard to interpret. This work presents two tools for deriving relevant information from a large amount of quantitative multivariate data, simply by post-processing the dendrograms resulting from hierarchical clustering. The first tool helps to assess the physical relevance of the obtained clusters, i.e. whether the detected families of elements result from true similarities or from spurious ones due to, e.g., experimental uncertainty. The second tool provides deeper knowledge of the factors governing the distribution of the elements in the multivariate space, that is, it identifies the most relevant parameters affecting the similarities among the configurations. These tools are particularly suitable for processing experimental results while coping with the related uncertainties, or for analysing multivariate data resulting from the study of complex or chaotic systems.
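For readers unfamiliar with the input that such post-processing works on, the following minimal sketch shows how a dendrogram can be obtained from multivariate data. It does not reproduce either of the two tools described in the abstract. The use of SciPy, the synthetic data set, the average-linkage choice, and all variable names are assumptions made purely for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Synthetic data (an assumption for illustration): two well-separated groups
# of 5 elements each, every element described by 3 variables.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=0.0, scale=0.1, size=(5, 3))
group_b = rng.normal(loc=5.0, scale=0.1, size=(5, 3))
data = np.vstack([group_a, group_b])

# Agglomerative hierarchical clustering. Each row of Z encodes one merge of
# the dendrogram: (cluster i, cluster j, merge height, size of new cluster).
Z = linkage(data, method="average", metric="euclidean")

# Cut the dendrogram into two flat clusters.
labels = fcluster(Z, t=2, criterion="maxclust")

# The merge heights are the quantities a dendrogram post-processing step
# would typically start from.
merge_heights = Z[:, 2]
print(labels)
print(merge_heights)
```

The linkage matrix `Z` is a compact encoding of the dendrogram, so any analysis of cluster relevance or of governing parameters can in principle work on it directly, without re-running the clustering.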
Publisher
Springer Science and Business Media LLC
Subject
Artificial Intelligence, Computer Networks and Communications, Hardware and Architecture, Information Systems, Software