Ultra-fast genome-wide inference of pairwise coalescence times
Author:
Schweiger Regev, Durbin Richard
Abstract
The pairwise sequentially Markovian coalescent (PSMC) algorithm and its extensions infer the coalescence time of two homologous chromosomes at each genomic position. This inference is used in reconstructing demographic histories, detecting selection signatures, genome-wide association studies, constructing ancestral recombination graphs and more. Inferring coalescence times between each pair of haplotypes in a large dataset is of great interest, as these times may provide rich information about the population structure and history of the sample. We introduce a new method, Gamma-SMC, which is >14 times faster than current methods. To obtain this speedup, we represent the posterior coalescence time distribution at each position succinctly as a Gamma distribution with just two parameters, whereas PSMC and its extensions hold it as a vector over discrete time intervals. Thus, Gamma-SMC has constant time complexity per site, independent of the number of discrete time states. Additionally, because of this continuous representation, our method can infer times spanning many orders of magnitude, and as such is robust to parameter misspecification. We describe how this approach works, illustrate its performance on simulated and real data, and use it to study recent positive selection in the 1000 Genomes Project dataset.
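To illustrate why a two-parameter Gamma representation gives constant work per site, the following is a minimal sketch of a conjugate emission update only. It is not the authors' implementation: the abstract does not spell out the update rules, so the Poisson mutation model, the `GammaPosterior` class, the `emission_update` and `run_emissions` functions, and the scaled mutation rate `theta` are all assumptions made for illustration. The recombination (transition) step that any full SMC method needs is deliberately omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GammaPosterior:
    """Gamma(shape, rate) belief about a pairwise coalescence time (hypothetical helper)."""
    shape: float
    rate: float

    @property
    def mean(self) -> float:
        return self.shape / self.rate


def emission_update(post: GammaPosterior, is_het: bool, theta: float) -> GammaPosterior:
    """Update the belief after observing one site.

    Assumed model: given coalescence time t, a homozygous site has likelihood
    exp(-theta * t) and a heterozygous site has likelihood
    theta * t * exp(-theta * t). Multiplying a Gamma density
    t**(shape - 1) * exp(-rate * t) by either term yields another Gamma,
    so the update touches only two numbers regardless of any time grid.
    """
    new_shape = post.shape + (1.0 if is_het else 0.0)
    new_rate = post.rate + theta
    return GammaPosterior(new_shape, new_rate)


def run_emissions(sites, theta, prior=GammaPosterior(1.0, 1.0)):
    """Apply the emission update along a sequence of hom/het observations.

    The default prior Gamma(1, 1) is an exponential with rate 1, an assumed
    stand-in for the standard coalescent prior in coalescent units.
    """
    post = prior
    for is_het in sites:
        post = emission_update(post, is_het, theta)
    return post
```

For example, `run_emissions([False, True, False], theta=0.5)` performs three constant-time updates and returns a `GammaPosterior` whose `shape` counts the heterozygous sites seen plus the prior shape, and whose `rate` accumulates `theta` once per site. A discretized method would instead update a vector with one entry per time interval at every site.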
Publisher
Cold Spring Harbor Laboratory
Cited by
4 articles.