Abstract
The 19 standard bioclimatic variables available from the Worldclim dataset are some of the most widely used data in ecology and organismal biology. It is well known that many of the variables are correlated with each other, suggesting that they contain fewer than 19 independent dimensions of information. But how much information is there? Here I explore the 19 Worldclim bioclimatic variables from the perspective of the manifold hypothesis: that many high-dimensional datasets are actually confined to a lower-dimensional manifold embedded in an ambient space. Using a state-of-the-art generative probabilistic model (a variational autoencoder) to model the data on a non-linear manifold reveals that only 5 uncorrelated dimensions are adequate to capture the full range of variation in the bioclimatic variables. I show that these 5 variables have meaningful structure and are sufficient to produce species distribution models (SDMs) nearly as good as, and in some ways better than, SDMs using the original 19 bioclimatic variables. I have made the 5 synthetic variables available as a raster dataset at 2.5-minute resolution in an R package that also includes functions to convert back and forth between the 5 variables and the original 19 (https://github.com/rdinnager/biocman).
Publisher
Cold Spring Harbor Laboratory