Inferring the ancestry of everyone

Published: 2018-11-01

Author:

Kelleher Jerome, Wong Yan, Albers Patrick K., Wohns Anthony W., McVean Gil

Abstract

A central problem in evolutionary biology is to infer the full genealogical history of a set of DNA sequences. This history contains rich information about the forces that have influenced a sexually reproducing species. However, existing methods are limited: the most accurate is unable to cope with more than a few dozen samples. With modern genetic data sets rapidly approaching millions of genomes, there is an urgent need for efficient inference methods to exploit such rich resources. We introduce an algorithm to infer whole-genome history which has comparable accuracy to the state-of-the-art but can process around four orders of magnitude more sequences. Additionally, our method results in an “evolutionary encoding” of the original sequence data, enabling efficient access to genealogies and calculation of genetic statistics over the data. We apply this technique to human data from the 1000 Genomes Project, Simons Genome Diversity Project and UK Biobank, showing that the genealogies we estimate are both rich in biological signal and efficient to process.

Publisher

Cold Spring Harbor Laboratory

References: 52 articles.

1. A global reference for human genetic variation

2. Dating genomic variants and shared ancestry in population-scale sequencing data

3. The importance and application of the ancestral recombination graph;Front Genet,2013

4. On the computational complexity of the rooted subtree prune and regraft distance;Annals of combinatorics,2005

5. C. Bycroft, C. Freeman, D. Petkova, G. Band, L. T. Elliott, K. Sharp, A. Motyer, D. Vukcevic, O. Delaneau, J. O’Connell, et al. The UK Biobank resource with deep phenotyping and genomic data. Nature, (562):203–209, 2018.

Cited by 8 articles.

1. Compression for population genetic data through finite-state entropy;2021-02-18

2. The spatiotemporal spread of human migrations during the European Holocene;Proceedings of the National Academy of Sciences;2020-04-01

3. Comparing Phylogeographies: Incompatible Geographical Histories in Pathogens’ Genomes;2020-01-11

4. An approximate full-likelihood method for inferring selection and allele frequency trajectories from DNA sequence data;PLOS Genetics;2019-09-13

5. A method for genome-wide genealogy estimation for thousands of samples;Nature Genetics;2019-09