Abstract
The Li and Stephens (LS) haplotype-copying model is a seminal framework that represents a target haplotype as an imperfect mosaic of a set of reference haplotypes. Formulated as a hidden Markov model, it switches between source haplotypes to model recombination. The model has been used in several applications in modern populations, including phasing and ancestry inference. Recent publications have examined its applicability when ancient individuals are the targets and modern reference panels are the source data. Previous research exploring how time separation between the modern references and the ancient target affects the model’s behavior relied on coalescent simulation to generate genetic variation data, which could lead to an underestimation of the ancient population’s genetic diversity. Further, these simulations were restricted to a relatively short period of anatomically modern human history. To overcome these limitations, our study evaluates the robustness of the LS model using forward-simulated data, enabling us to sample haplotypes that do not have direct descendants in the modern population. Additionally, we evaluate the model under the simple demographic scenario of a constant-sized continuous population starting 1.5 million years ago to isolate the effect of time separation. Results indicate good performance for target haplotypes up to 900,000 years old, suggesting potential applicability to ancient DNA (aDNA) from anatomically modern humans. Although more complex demographic scenarios should be considered for a definitive answer, this research serves as a starting point for evaluating the haplotype-copying framework in aDNA data analysis.
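To make the copying idea concrete, the following is a minimal sketch of an LS-style forward algorithm, not the implementation evaluated in the study. It assumes biallelic sites coded as 0/1, a single per-site switch probability, and a single mismatch probability. The function name `ls_log_likelihood` and the parameters `switch_prob` and `mismatch_prob` are hypothetical names chosen for illustration, and the constant switch probability is a simplifying assumption standing in for a recombination model.

```python
import math


def ls_log_likelihood(target, panel, switch_prob=0.05, mismatch_prob=0.01):
    """Log-likelihood of `target` as an imperfect mosaic of `panel` haplotypes.

    target: list of 0/1 alleles, one per site.
    panel: list of reference haplotypes, each a list of 0/1 alleles.
    switch_prob, mismatch_prob: illustrative constants (assumptions).
    """
    n = len(panel)
    if n == 0 or not target:
        raise ValueError("need at least one reference haplotype and one site")

    def emission(k, site):
        # A copied allele matches the source unless a mismatch occurs.
        if panel[k][site] == target[site]:
            return 1.0 - mismatch_prob
        return mismatch_prob

    # Site 0: every reference haplotype is equally likely to be the source.
    forward = [emission(k, 0) / n for k in range(n)]
    total = sum(forward)
    log_lik = math.log(total)
    forward = [f / total for f in forward]  # rescale to avoid underflow

    for site in range(1, len(target)):
        # Either keep copying haplotype k, or switch to a source drawn
        # uniformly from the panel. `forward` sums to 1 after rescaling,
        # so the switch term reduces to switch_prob / n.
        forward = [
            ((1.0 - switch_prob) * forward[k] + switch_prob / n)
            * emission(k, site)
            for k in range(n)
        ]
        total = sum(forward)
        log_lik += math.log(total)
        forward = [f / total for f in forward]

    return log_lik


if __name__ == "__main__":
    target = [0, 0, 1, 1]
    # A panel whose haplotypes can be stitched together to form the target.
    print(ls_log_likelihood(target, [[0, 0, 0, 0], [1, 1, 1, 1]]))
    # A panel that mismatches the target at every site.
    print(ls_log_likelihood(target, [[1, 1, 0, 0], [1, 1, 0, 0]]))
```

Under these assumptions, a target that can be assembled from segments of the panel receives a higher likelihood than one that mismatches the panel at every site. This is the behavior that time separation between target and references would be expected to erode.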
Publisher
Cold Spring Harbor Laboratory