Abstract
Background
Low-depth sequencing allows researchers to increase sample size at the expense of accuracy. To incorporate the resulting uncertainty while maintaining statistical power, we introduce MCPCA_PopGen to analyze the population structure of low-depth sequencing data.
Results
The method optimizes the choice of nonlinear transformations of dosages to maximize the Ky Fan norm of the covariance matrix. The transformation incorporates the uncertainty in calling between heterozygotes and the common homozygotes for loci having a rare allele and is more linear when both variants are common.
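As a rough illustration of the objective named above, the following Python sketch computes the Ky Fan k-norm (the sum of the k largest singular values) of the covariance matrix of transformed dosages. It is not the package's implementation: the function names `ky_fan_norm` and `covariance_of_transformed`, the toy dosage matrix, and the identity transform are all assumptions made for illustration, and the optimization over transformations is not shown.

```python
import numpy as np


def ky_fan_norm(matrix, k):
    """Return the Ky Fan k-norm: the sum of the k largest singular values."""
    # np.linalg.svd returns singular values sorted in descending order.
    singular_values = np.linalg.svd(matrix, compute_uv=False)
    return float(np.sum(singular_values[:k]))


def covariance_of_transformed(dosages, transform):
    """Covariance across loci after transforming and standardizing each locus.

    `dosages` is a (samples x loci) array; `transform` is any elementwise
    function (hypothetical stand-in for the optimized transformation).
    """
    z = transform(dosages)
    z = z - z.mean(axis=0)
    sd = z.std(axis=0)
    sd[sd == 0] = 1.0  # avoid dividing by zero for monomorphic loci
    z = z / sd
    return (z.T @ z) / z.shape[0]


# Toy data (assumption): 50 samples x 20 loci, expected dosages in [0, 2].
rng = np.random.default_rng(0)
dosages = rng.uniform(0.0, 2.0, size=(50, 20))

# The identity transform corresponds to ordinary PCA on standardized dosages.
cov = covariance_of_transformed(dosages, lambda d: d)
objective = ky_fan_norm(cov, k=2)
```

A candidate nonlinear transformation could be passed in place of the identity function, and the resulting values of `objective` compared; a larger value means the top k components capture more of the total variance.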
Conclusions
We apply MCPCA_PopGen to samples from two indigenous Siberian populations and accurately reveal hidden population structure using only a single chromosome. The package is available at https://github.com/yiwenstat/MCPCA_PopGen.
Funder
National Institute of General Medical Sciences
National Human Genome Research Institute
National Institute of Diabetes and Digestive and Kidney Diseases
National Science Foundation
National Heart, Lung, and Blood Institute
Directorate for Mathematical and Physical Sciences
Publisher
Springer Science and Business Media LLC
Subject
Applied Mathematics, Computer Science Applications, Molecular Biology, Biochemistry, Structural Biology
Cited by
1 article.