Abstract
Background
SNP datasets can be used to infer a wealth of information about natural populations, including their structure, genetic diversity, and the presence of loci under selection. However, SNP data analysis can be a time-consuming and challenging process, not least because many different software packages are currently needed to execute and depict the wide variety of mainstream population-genetic analyses. Here we present SambaR, an integrative and user-friendly R package that automates and simplifies quality control and population-genetic analyses of biallelic SNP datasets. SambaR allows users to perform mainstream population-genetic analyses and to generate a wide variety of ready-to-publish graphs with a minimal number of commands (fewer than ten). These wrapper commands call functions of existing packages (including adegenet, ape, LEA, poppr, pcadapt and StAMPP) as well as new tools uniquely implemented in SambaR.
Results
We tested SambaR on SNP datasets available online and found that SambaR can process datasets of millions of SNPs and hundreds of individuals within hours, given sufficient computing power. Newly developed tools implemented in SambaR facilitate optimization of filter settings, support objective interpretation of ordination analyses, enhance comparability of diversity estimates from reduced representation library SNP datasets, and generate reduced SNP panels and structure-like plots with Bayesian population assignment probabilities.
Conclusion
SambaR facilitates rapid population-genetic analyses of biallelic SNP datasets by removing three major time sinks: file handling, software learning, and data plotting. In addition, SambaR provides a convenient platform for SNP data storage and management, as well as several new utilities, including guidance in setting appropriate data filters.
Availability and implementation
The SambaR source script, manual and example datasets are distributed through GitHub: https://github.com/mennodejong1986/SambaR
Publisher
Cold Spring Harbor Laboratory