Abstract
Good empirical applications of geometric morphometrics (GMM) typically involve several times more variables than specimens, a situation the statistician refers to as "high p/n," where p is the count of variables and n the count of specimens. This note calls your attention to two predictable catastrophic failures of one particular multivariate statistical technique, between-groups principal components analysis (bgPCA), in this high-p/n setting. The more obvious pathology is this: when applied to the patternless (null) model of p identically distributed Gaussians over groups of the same size, both bgPCA and its algebraic equivalent, partial least squares (PLS) analysis against group, necessarily generate the appearance of huge equilateral group separations that are actually fictitious (absent from the statistical model). When specimen counts by group vary greatly or when any group includes fewer than about ten specimens, an even worse failure of the technique obtains: the smaller the group, the more likely a bgPCA is to fictitiously identify that group as the end-member of one of its derived axes. For these two reasons, when used in GMM and other high-p/n settings the bgPCA method very often leads to invalid or insecure bioscientific inferences. This paper demonstrates and quantifies these and other pathological outcomes both for patternless models and for models with one or two valid factors, then offers suggestions for how GMM practitioners should protect themselves against the consequences for inference of these lamentably predictable misrepresentations. The bgPCA method should never be used unskeptically (it is never authoritative), and whenever it appears in partial support of any biological inference it must be accompanied by a wide range of diagnostic plots and other challenges, many of which are presented here for the first time.
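The first pathology described above can be reproduced in a few lines. The following is a minimal sketch, not the paper's own code: it assumes an unweighted bgPCA (PCA of the group means, with all specimens then projected onto those axes), three groups of ten specimens, and standard Gaussian noise with no group structure at all. The helper names `bgpca_scores`, `separation_ratio`, and `simulate_null` are hypothetical and introduced only for illustration.

```python
import numpy as np

def bgpca_scores(X, groups):
    """Assumed bgPCA variant: project specimens onto the principal
    axes of the (unweighted) group means."""
    Xc = X - X.mean(axis=0)                      # center on the grand mean
    labels = np.unique(groups)
    means = np.vstack([Xc[groups == g].mean(axis=0) for g in labels])
    # Principal axes of the group means are the rows of Vt.
    _, _, Vt = np.linalg.svd(means - means.mean(axis=0), full_matrices=False)
    k = len(labels) - 1                          # at most (groups - 1) axes
    return Xc @ Vt[:k].T

def separation_ratio(scores, groups):
    """Between-group over within-group sum of squares in score space."""
    grand = scores.mean(axis=0)
    between, within = 0.0, 0.0
    for g in np.unique(groups):
        s = scores[groups == g]
        between += len(s) * np.sum((s.mean(axis=0) - grand) ** 2)
        within += np.sum((s - s.mean(axis=0)) ** 2)
    return between / within

def simulate_null(p, rng, n_per_group=10, n_groups=3):
    """Patternless model: every specimen is pure Gaussian noise in p dimensions."""
    groups = np.repeat(np.arange(n_groups), n_per_group)
    X = rng.standard_normal((n_per_group * n_groups, p))
    scores = bgpca_scores(X, groups)
    centroids = np.vstack([scores[groups == g].mean(axis=0)
                           for g in range(n_groups)])
    # Pairwise distances between group centroids in the bgPCA plane.
    dists = [np.linalg.norm(centroids[i] - centroids[j])
             for i in range(n_groups) for j in range(i + 1, n_groups)]
    return separation_ratio(scores, groups), dists

rng = np.random.default_rng(0)
low, _ = simulate_null(p=5, rng=rng)              # low p/n: 5 variables, 30 specimens
high, high_dists = simulate_null(p=3000, rng=rng) # high p/n: 3000 variables, 30 specimens

print("separation ratio, p=5:   ", low)
print("separation ratio, p=3000:", high)
print("centroid distances, p=3000:", high_dists)
```

Under these assumptions the separation ratio grows roughly in proportion to p, and at p = 3000 the three centroid distances come out nearly equal, which is the fictitious equilateral configuration the abstract warns about. Nothing in the simulated data distinguishes the groups; the apparent separation is produced entirely by projecting high-dimensional noise onto axes fitted to the group means.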
Publisher
Cold Spring Harbor Laboratory