Abstract
A number of popular methods for inferring the evolutionary relationships between populations require essentially two components. First, they require estimates of f2-statistics, or of some quantity that is a linear combination of these. Second, they require estimates of the variability of the statistic in question. Examples of methods in this class include qpGraph and TreeMix.

It is known, however, that these statistics are biased when based on genotype calls at low depth. Moreover, as we show, this bias leads to downstream inference of significantly distorted trees. To solve this problem, we demonstrate how to accurately and efficiently compute a broad class of statistics, including estimates of their standard errors, from low-depth whole-genome sequencing data by using the site frequency spectrum. In particular, we focus on f2 and the sample covariance of allele frequencies to show how this method leads to accurate estimates of drift when fitting trees using qpGraph and TreeMix with low-depth data. However, the same considerations lead to uncertainty estimates for a variety of other statistics, including heterozygosity, kinship estimates (e.g. KING), and quantities relating to genetic differentiation such as Fst and Dxy.
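To make the idea of computing f2 from the site frequency spectrum concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a two-dimensional SFS has already been estimated, stored as a nested list in which entry `[i][j]` is the (possibly fractional) number of sites with `i` derived alleles among the sampled chromosomes of population 1 and `j` among those of population 2. The function name `f2_from_sfs` and the input layout are hypothetical. The per-site term is the standard sampling-corrected estimator, (p1 - p2)^2 - p1(1 - p1)/(n1 - 1) - p2(1 - p2)/(n2 - 1), averaged over all sites in the spectrum, including monomorphic ones.

```python
def f2_from_sfs(sfs):
    """Estimate f2 between two populations from a 2D site frequency spectrum.

    Assumed layout (hypothetical, for illustration only): sfs[i][j] is the
    number of sites with i derived alleles out of n1 sampled chromosomes in
    population 1 and j out of n2 in population 2. Requires n1 >= 2 and n2 >= 2.
    """
    n1 = len(sfs) - 1
    n2 = len(sfs[0]) - 1
    if n1 < 2 or n2 < 2:
        raise ValueError("need at least two chromosomes per population")

    total = sum(sum(row) for row in sfs)
    acc = 0.0
    for i, row in enumerate(sfs):
        p1 = i / n1
        for j, count in enumerate(row):
            p2 = j / n2
            # Squared frequency difference minus the finite-sample
            # correction for each population.
            term = (p1 - p2) ** 2
            term -= p1 * (1 - p1) / (n1 - 1)
            term -= p2 * (1 - p2) / (n2 - 1)
            acc += count * term
    # Average over all sites represented in the spectrum.
    return acc / total


# Toy spectrum: two chromosomes per population, 10 fixed differences
# and 10 sites where both populations carry only the ancestral allele.
toy = [[10, 0, 0],
       [0, 0, 0],
       [10, 0, 0]]
f2_toy = f2_from_sfs(toy)
```

Because the estimate is a weighted sum over SFS entries, the same loop structure would apply to other statistics that can be written as a function of per-site allele counts; only the per-entry term changes.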
Publisher
Cold Spring Harbor Laboratory
Cited by
1 article.