Identity-by-descent segments in large samples

Published: 2024-06-08

Authors:

Seth D. Temple, Elizabeth A. Thompson

Abstract

If two haplotypes share the same alleles across an extended gene tract, these haplotypes are likely to be identical by descent from a recent common ancestor. Identity-by-descent segment lengths are correlated via unobserved tree and recombination processes, which commonly presents challenges to the derivation of theoretical results in population genetics. Under interpretable regularity conditions, we show that the proportion of detectable identity-by-descent segments at a locus is asymptotically normally distributed as the sample size and the scaled population size grow large. We use efficient and exact simulations to study the distributional behavior of the detectable identity-by-descent rate in finite samples. One consequence of non-normality in finite samples is that genome-wide scans based on identity-by-descent rates may be subject to anti-conservative Type 1 error control.

Highlights

We show the asymptotic normality of the identity-by-descent rate, a mean of correlated binary random variables that arises in population genetics studies.

We describe an efficient algorithm capable of simulating long identity-by-descent segments around a locus in large sample sizes.

In enormous simulation studies, we use this algorithm to characterize the distributional properties of the identity-by-descent rate.

In finite samples, we reject the null hypothesis of normality more often than the nominal significance level, indicating that genome-wide scans based on identity-by-descent rates may be anti-conservative.

Publisher

Cold Spring Harbor Laboratory
