Principal Component Analysis for Distributions Observed by Samples in Bayes Spaces

Author:

Pavlů Ivana, Machalová Jitka, Tolosana-Delgado Raimon, Hron Karel, Bachmann Kai, van den Boogaart Karl Gerald

Abstract

Distributional data have recently become increasingly important for understanding processes in the geosciences, thanks to the establishment of cost-efficient analytical instruments capable of measuring properties over large numbers of particles, grains or crystals in a sample. Functional data analysis allows the direct application of multivariate methods, such as principal component analysis, to such distributions. However, these distributions are often observed only in the form of samples, and thus incur a sampling error. This additional sampling error changes the properties of the multivariate variance and thus the number of relevant principal components and their directions. The result of the principal component analysis becomes an artifact of the sampling error and can negatively affect the subsequent data analysis. This work presents a way of estimating this sampling error and of confronting it in the context of principal component analysis, where the principal components are obtained as linear combinations of elements of a newly constructed orthogonal spline basis. The effect of the sampling error and the effectiveness of the correction are demonstrated with a series of simulations. It is shown how the interpretability and reproducibility of the principal components improve and become independent of the selection of the basis. The proposed method is then applied to a geometallurgical dataset of grain size distributions from the Thaba mine in the Bushveld Complex.

Funder

HiTEc Cost Action

Univerzita Palackého v Olomouci

Grantová Agentura České Republiky

Spanish Ministry of Science and Innovation

Publisher

Springer Science and Business Media LLC


Cited by 1 article.
