Benchmarking kinship estimation tools for ancient genomes using pedigree simulations

Authors:

Aktürk Şevval (1), Mapelli Igor (1), Güler Merve N. (1), Gürün Kanat (1), Katırcıoğlu Büşra (1), Vural Kıvılcım Başak (1), Sağlıcan Ekin (2), Çetin Mehmet (1), Yaka Reyhan (1, 3, 4), Sürer Elif (5), Atağ Gözde (1), Çokoğlu Sevim Seda (1), Sevkar Arda (6), Altınışık N. Ezgi (6), Koptekin Dilek (2), Somel Mehmet (1)

Affiliations:

1. Department of Biological Sciences, Middle East Technical University, Ankara, Turkey

2. Department of Health Informatics, Graduate School of Informatics, Middle East Technical University, Ankara, Turkey

3. Centre for Palaeogenetics, Stockholm, Sweden

4. Department of Archaeology and Classical Studies, Stockholm University, Stockholm, Sweden

5. Department of Modeling and Simulation, Graduate School of Informatics, Middle East Technical University, Ankara, Turkey

6. Department of Anthropology, Hacettepe University, Ankara, Turkey

Abstract

There is growing interest in uncovering genetic kinship patterns in past societies using low-coverage palaeogenomes. Here, we benchmark four tools for kinship estimation with such data: lcMLkin, NgsRelate, KIN, and READ, which differ in their input, IBD estimation methods, and statistical approaches. We used pedigree and ancient genome sequence simulations to evaluate these tools when only a limited number (1 K to 50 K, with minor allele frequency ≥0.01) of shared SNPs are available. The performance of all four tools was comparable using ≥20 K SNPs. We found that first-degree related pairs can be accurately classified even with 1 K SNPs, with 85% F1 scores using READ and 96% using NgsRelate or lcMLkin. Distinguishing third-degree relatives from unrelated pairs or second-degree relatives was also possible with high accuracy (F1 > 90%) with 5 K SNPs using NgsRelate and lcMLkin, while READ and KIN showed lower success (69% and 79%, respectively). Meanwhile, noise in population allele frequencies and inbreeding (first-cousin mating) led to deviations in kinship coefficients, with different sensitivities across tools. We conclude that using multiple tools in parallel might be an effective approach to achieve robust estimates on ultra-low-coverage genomes.
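The per-degree F1 scores reported above can be illustrated with a minimal Python sketch. It is not the procedure of any of the four benchmarked tools, each of which has its own classification logic. It assumes that a kinship coefficient (theta) has already been estimated for every pair, and it assigns degrees using power-of-two cutoffs placed between the expected values for non-inbred pairs (0.25 for first-degree, 0.125 for second-degree, 0.0625 for third-degree), in the style of KING. The names `classify` and `f1_per_class` and the toy theta values are hypothetical.

```python
from collections import Counter

# Lower bounds on theta for each degree (assumption: KING-style cutoffs at
# powers of two, halfway on a log scale between expected kinship values).
CUTOFFS = [
    (2 ** -1.5, "identical"),  # theta > ~0.354
    (2 ** -2.5, "first"),      # theta > ~0.177
    (2 ** -3.5, "second"),     # theta > ~0.088
    (2 ** -4.5, "third"),      # theta > ~0.044
]


def classify(theta):
    """Map an estimated kinship coefficient to a relatedness degree."""
    for lower_bound, label in CUTOFFS:
        if theta > lower_bound:
            return label
    return "unrelated"


def f1_per_class(true_labels, predicted_labels):
    """Return {label: F1}, where F1 = 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = {}
    for label in set(true_labels) | set(predicted_labels):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


# Toy data (hypothetical, not taken from the study): simulated truth and
# noisy theta estimates for six pairs.
truth = ["first", "first", "second", "third", "third", "unrelated"]
thetas = [0.24, 0.26, 0.10, 0.07, 0.03, 0.01]
predicted = [classify(t) for t in thetas]
scores = f1_per_class(truth, predicted)
```

In this toy run, one third-degree pair with theta = 0.03 falls below the lowest cutoff and is called unrelated, which lowers the F1 of both the "third" and "unrelated" classes. This is the kind of confusion between third-degree and unrelated pairs that the abstract describes at low SNP counts.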

Publisher

Wiley

References: 71 articles.

1. A community-maintained standard library of population genetic models

2. Aktürk, Ş., Mapelli, I., Güler, M. N., & Somel, M. (2023). Simulated Ancient Genomic Kinship Dataset: VCF and BAM (1×) Files for Related (including inbred) Pairs (1.0). Zenodo. https://doi.org/10.5281/zenodo.10070958

3. READv2: Advanced and user-friendly detection of biological relatedness in archaeogenomics

4. Altınışık, E. (2023). lcMLkin v2.1. GitHub. https://github.com/altinisik/lcMLkin-v2.1

5. A genomic snapshot of demographic and cultural dynamism in Upper Mesopotamia during the Neolithic Transition

Cited by 2 articles.
