Abstract
Accurate reconstruction of pedigrees from genetic data remains a challenging problem. Pedigree inference algorithms are often trained only on urban European-descent families, which are comparatively ‘outbred’ relative to many other global populations. Relationship categories can be difficult to distinguish (e.g. half-sibships versus avuncular) without external information. Furthermore, published software cannot accommodate endogamous populations where there may be reticulations within a pedigree or elevated haplotype sharing. We design a simple, rapid algorithm which initially uses only high-confidence first-degree relationships to seed a machine learning step based on the number of identical-by-descent segments. Additionally, we define a new statistic to polarize individuals to the ancestor versus descendant generation. We test our approach in a sample of 700 individuals from an endogamous population in northern Namibia. Due to a culture of concurrent relationships in this population, there is a high proportion of half-sibships. We accurately identify first- through third-degree relationships for all categories, including, for example, half-sibships and half-avuncular relationships. We further validate our approach in the Barbados Asthma Genetics Study (BAGS) dataset. Accurate reconstruction of pedigrees holds promise for tracing allele frequency trajectories, improved phasing and other population genomic questions.
Publisher
Cold Spring Harbor Laboratory
Cited by
15 articles.