Author:
Schumer Molly, Powell Daniel L., Corbett-Detig Russ
Abstract
It is now clear that hybridization between species is much more common than previously recognized. As a result, we now know that the genomes of many modern species, including our own, are a patchwork of regions derived from past hybridization events. Increasingly, researchers are interested in disentangling which regions of the genome originated from each parental species using local ancestry inference methods. Due to the diverse effects of admixture, this interest is shared across disparate fields, from human genetics to research in ecology and evolutionary biology. However, local ancestry inference methods are sensitive to a range of biological and technical parameters that can impact accuracy. Here we present paired simulation and ancestry inference pipelines, mixnmatch and ancestryinfer, to help researchers plan and execute local ancestry inference studies. mixnmatch can simulate arbitrarily complex demographic histories in the parental and hybrid populations, selection on hybrids, and technical variables such as coverage and contamination. ancestryinfer takes as input sequencing reads from simulated or real individuals, and implements an efficient local ancestry inference pipeline. We perform a series of simulations with mixnmatch to pinpoint factors that influence accuracy in local ancestry inference and highlight useful features of the two pipelines. Together, mixnmatch and ancestryinfer are powerful tools for predicting the performance of local ancestry inference methods on real data.
Publisher
Cold Spring Harbor Laboratory