Abstract
R-mode hierarchical clustering is a method for forming hierarchical groups of mutually exclusive subsets of variables. This R-mode cluster method identifies interrelationships between variables that are useful for variable selection and dimension reduction. Importantly, the method is based on metric elements defined on the sample space of variables. Consequently, hierarchical clustering of compositional parts should respect the particular geometry of the simplex. In this work, the connections between concepts such as distance, cluster representative, compositional biplot, and log-ratio basis are explored within the framework of the most popular R-mode agglomerative hierarchical clustering methods. The approach is illustrated in a paleoecological study to identify groups of species sharing similar behavior.
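As a rough illustration of clustering parts in a way that respects the geometry of the simplex, the sketch below groups the columns (parts) of a compositional data set by their pairwise log-ratio variances. It is a minimal sketch under stated assumptions, not necessarily the procedure used in the article: the choice of the square root of the Aitchison variation matrix as the distance between parts, the Ward linkage, the helper names `variation_matrix` and `cluster_parts`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def variation_matrix(X):
    """Variation matrix: T[i, j] = var(ln(x_i / x_j)) across samples (rows)."""
    logs = np.log(X)
    n_parts = X.shape[1]
    T = np.zeros((n_parts, n_parts))
    for i in range(n_parts):
        for j in range(i + 1, n_parts):
            T[i, j] = T[j, i] = np.var(logs[:, i] - logs[:, j], ddof=1)
    return T


def cluster_parts(X, n_groups, method="ward"):
    """Group the parts (columns) of X into n_groups clusters.

    Assumption: the distance between two parts is the standard deviation
    of their log-ratio, i.e. the square root of the variation matrix.
    """
    dist = np.sqrt(variation_matrix(X))
    Z = linkage(squareform(dist, checks=False), method=method)
    return fcluster(Z, t=n_groups, criterion="maxclust")


# Synthetic example (hypothetical data): parts 0 and 1 are nearly
# proportional to each other, as are parts 2 and 3.
rng = np.random.default_rng(0)
n = 50
a = rng.lognormal(size=n)
b = rng.lognormal(size=n)
raw = np.column_stack([
    a,
    2.0 * a * np.exp(rng.normal(scale=0.01, size=n)),
    b,
    0.5 * b * np.exp(rng.normal(scale=0.01, size=n)),
])
X = raw / raw.sum(axis=1, keepdims=True)  # close each row to sum to 1

labels = cluster_parts(X, n_groups=2)
print(labels)
```

Because log-ratios are unchanged when each row is rescaled, closing the data to unit sum does not affect the distances; parts whose ratio stays nearly constant across samples end up close together and merge early in the hierarchy.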
Funder
Ministerio de Ciencia e Innovación
Agència de Gestió d’Ajuts Universitaris i de Recerca
Publisher
Springer Science and Business Media LLC
Subject
General Earth and Planetary Sciences, Mathematics (miscellaneous)