Principal Component Analysis for Distributions Observed by Samples in Bayes Spaces

Published: 2024-05-03
ISSN: 1874-8961
Container-title: Mathematical Geosciences
Language: en
Short-container-title: Math Geosci

Author:

Pavlů Ivana, Machalová Jitka, Tolosana-Delgado Raimon, Hron Karel, Bachmann Kai, van den Boogaart Karl Gerald

Abstract

Distributional data have recently become increasingly important for understanding processes in the geosciences, thanks to the establishment of cost-efficient analytical instruments capable of measuring properties over large numbers of particles, grains or crystals in a sample. Functional data analysis allows the direct application of multivariate methods, such as principal component analysis, to such distributions. However, these distributions are often observed only in the form of samples, and thus incur a sampling error. This additional sampling error changes the properties of the multivariate variance, and thus the number of relevant principal components and their directions. The result of the principal component analysis then becomes an artifact of the sampling error and can negatively affect the subsequent data analysis. This work presents a way of estimating this sampling error and of correcting for it in the context of principal component analysis, where the principal components are obtained as linear combinations of elements of a newly constructed orthogonal spline basis. The effect of the sampling error and the effectiveness of the correction are demonstrated with a series of simulations. It is shown how the interpretability and reproducibility of the principal components improve and become independent of the selection of the basis. The proposed method is then applied to grain size distributions in a geometallurgical dataset from the Thaba mine in the Bushveld complex.
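The effect described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' method: it uses plain histograms and a centred log-ratio (clr) transform instead of their orthogonal spline basis, and it does not implement their sampling-error correction. The function names (`clr`, `pca_explained_variance`, `simulate`), the pseudo-count of 0.5, and all simulation settings are assumptions made for illustration only. It compares PCA on exactly known densities with PCA on densities estimated from finite samples.

```python
import numpy as np

def clr(density):
    """Centred log-ratio transform of strictly positive discretised densities (one per row)."""
    logd = np.log(density)
    return logd - logd.mean(axis=1, keepdims=True)

def pca_explained_variance(X):
    """Explained-variance ratios of a PCA computed via SVD of the column-centred data."""
    Xc = X - X.mean(axis=0, keepdims=True)
    s = np.linalg.svd(Xc, compute_uv=False)
    var = s ** 2
    return var / var.sum()

def simulate(n_obs=50, n_bins=20, sample_size=None, seed=0):
    """Discretised unit-variance normal densities whose means vary between observations.

    sample_size=None returns the true densities evaluated at the bin midpoints;
    otherwise each density is estimated from a histogram of `sample_size` draws.
    """
    rng = np.random.default_rng(seed)
    edges = np.linspace(-4.0, 4.0, n_bins + 1)
    mids = 0.5 * (edges[:-1] + edges[1:])
    means = rng.normal(0.0, 0.5, size=n_obs)
    rows = []
    for m in means:
        if sample_size is None:
            d = np.exp(-0.5 * (mids - m) ** 2)
        else:
            x = rng.normal(m, 1.0, size=sample_size)
            counts, _ = np.histogram(x, bins=edges)
            d = counts + 0.5  # assumed pseudo-count so that log() is defined for empty bins
        rows.append(d / d.sum())
    return np.array(rows)

true_ratio = pca_explained_variance(clr(simulate(sample_size=None)))
sampled_ratio = pca_explained_variance(clr(simulate(sample_size=200)))
print("share of PC1, true densities:   ", round(float(true_ratio[0]), 4))
print("share of PC1, sampled densities:", round(float(sampled_ratio[0]), 4))
```

In this toy setting the clr-transformed true densities differ only by a term linear in the mean, so a single principal component carries essentially all the variance. Once the densities are estimated from samples, sampling noise spreads variance over further components, which is the kind of artifact the abstract refers to.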

Funder

HiTEc Cost Action

Univerzita Palackého v Olomouci

Grantová Agentura České Republiky

Spanish Ministry of Science and Innovation

Publisher

Springer Science and Business Media LLC

Link

https://link.springer.com/content/pdf/10.1007/s11004-024-10142-9.pdf

References (25 articles)

1. Bachmann K (2020) Predictive geometallurgical modelling. Ph.D. thesis, Technische Universität Bergakademie Freiberg

2. Bortolotti T (2021) Weighted functional data analysis for partially observed seismic data: an application to ground motion modelling in Italy. Ph.D. thesis, Politecnico di Milano

3. De Boor C (1978) A practical guide to splines. Springer, New York

4. Doob JL (1935) The limiting distributions of certain statistics. Ann Math Stat 6(3):160–169

5. Egozcue J, Díaz-Barrero J, Pawlowsky-Glahn V (2006) Hilbert space of probability density functions based on Aitchison geometry. Acta Math Sinica 22:1175–1182

Cited by 1 article.

1. Enhanced coalbed methane well production prediction framework utilizing the CNN-BL-MHA approach;Scientific Reports;2024-06-26