Exploratory Analysis of Provenance Data Using R and the Provenance Package

Published:2019-03-22 Issue:3 Volume:9 Page:193
ISSN:2075-163X
Container-title:Minerals
language:en
Short-container-title:Minerals

Author:

Vermeesch Pieter

Abstract

The provenance of siliciclastic sediment may be traced using a wide variety of chemical, mineralogical and isotopic proxies. These define three distinct data types: (1) compositional data such as chemical concentrations; (2) point-counting data such as heavy mineral compositions; and (3) distributional data such as zircon U-Pb age spectra. Each of these three data types requires separate statistical treatment. Central to any such treatment is the ability to quantify the 'dissimilarity' between two samples. For compositional data, this is best done using a logratio distance. Point-counting data may be compared using the chi-square distance, which deals better with missing components (zero values) than the logratio distance does. Finally, distributional data can be compared using the Kolmogorov–Smirnov and related statistics. For small datasets using a single provenance proxy, data interpretation can sometimes be done by visual inspection of ternary diagrams or age spectra. However, this no longer works for larger and more complex datasets. This paper reviews a number of multivariate ordination techniques to aid the interpretation of such studies. Multidimensional Scaling (MDS) is a generally applicable method that displays the salient similarities and differences between multiple samples as a configuration of points in which similar samples plot close together and dissimilar samples plot far apart. For compositional data, classical MDS analysis of logratio data is shown to be equivalent to Principal Component Analysis (PCA). The resulting MDS configurations can be augmented with compositional information as biplots. For point-counting data, classical MDS analysis of chi-square distances is shown to be equivalent to Correspondence Analysis (CA). This technique also produces biplots. Thus, MDS provides a common platform to visualise and interpret all types of provenance data.
Generalising the method to three-way dissimilarity tables provides an opportunity to combine several datasets together and thereby facilitate the interpretation of 'Big Data'. This paper presents a set of tutorials using the statistical programming language R. It illustrates the theoretical underpinnings of compositional data analysis, PCA, MDS and other concepts using toy examples, before applying these methods to real datasets with the provenance package.
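
The three dissimilarity measures named in the abstract can be illustrated with a minimal sketch. It is written in Python using only the standard library; it is not taken from the paper, whose tutorials use R, and it does not reproduce the API of the provenance package. The function names are hypothetical, and the chi-square distance is assumed to follow the usual correspondence-analysis definition, in which each component is weighted by the inverse of its share of the table total.

```python
import math
from bisect import bisect_right


def clr(composition):
    """Centred logratio transform of a strictly positive composition."""
    logs = [math.log(v) for v in composition]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def logratio_distance(x, y):
    """Logratio (Aitchison) distance: Euclidean distance after clr transform.

    Undefined when any component is zero, because log(0) does not exist.
    """
    return math.dist(clr(x), clr(y))


def chi_square_distance(table, i, j):
    """Chi-square distance between rows i and j of a table of counts.

    Assumed form: sqrt(sum_k (N / C_k) * (X_ik / R_i - X_jk / R_j)^2),
    with R the row totals, C the column totals and N the grand total.
    Zero counts are allowed; components that are empty in every sample
    are skipped.
    """
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    total = sum(row_sums)
    return math.sqrt(sum(
        (total / col_sums[k])
        * (table[i][k] / row_sums[i] - table[j][k] / row_sums[j]) ** 2
        for k in range(len(col_sums))
        if col_sums[k] > 0
    ))


def ks_distance(a, b):
    """Kolmogorov-Smirnov statistic: largest gap between two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    return max(
        abs(bisect_right(a, t) / len(a) - bisect_right(b, t) / len(b))
        for t in a + b
    )
```

The logratio distance depends only on the ratios between components, so rescaling a composition leaves it unchanged, whereas the chi-square distance remains defined when a count is zero. A table of such pairwise distances is the input that an MDS routine would then turn into a configuration of points.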

Publisher

MDPI AG

Subject

Geology, Geotechnical Engineering and Engineering Geology

Link

https://www.mdpi.com/2075-163X/9/3/193/pdf

References: 68 articles.

1. Combined U–Pb and Hf isotope LA-(MC-)ICP-MS analyses of detrital zircons: Comparison with SHRIMP and new constraints for the provenance and age of an Armorican metasediment in Central Germany

2. Sediment provenance; Mazumder, 2017

3. Geochemical studies of detrital heavy minerals and their application to provenance research; Morton, 1991

4. The provenance of Taklamakan desert sand

5. Quantitative provenance analysis of sediments: review and outlook

Cited by 16 articles.

1. The geochemistry of fluvial sediments from large rivers: Old problems and new developments; Reference Module in Earth Systems and Environmental Sciences; 2024

2. Correction: Vermeesch, P. Exploratory Analysis of Provenance Data Using R and the Provenance Package. Minerals 2019, 9, 193; Minerals; 2023-03-08

3. Seasonal variation in morphotype composition of pelagic Sargassum influx events is linked to oceanic origin; Scientific Reports; 2023-03-07

4. Multidimensional Scaling of Varietal Data in Sedimentary Provenance Analysis; Journal of Geophysical Research: Earth Surface; 2023-03

5. Late Paleozoic cratonal sink: Distally sourced sediment filled the Anadarko Basin (USA) from multiple source regions; Geosphere; 2022-11-04