Scaling‐up RADseq methods for large datasets of non‐invasive samples: Lessons for library construction and data preprocessing

Author:

Arantes Larissa S. (1, 2), Caccavo Jilda A. (3, 4), Sullivan James K. (1, 5), Sparmann Sarah (1, 6), Mbedi Susan (1, 7), Höner Oliver P. (2), Mazzoni Camila J. (1, 2)

Affiliation:

1. Berlin Center for Genomics in Biodiversity Research (BeGenDiv) Berlin Germany

2. Leibniz‐Institut für Zoo‐ und Wildtierforschung (IZW) Berlin Germany

3. Laboratoire des Sciences du Climat et de l’Environnement, LSCE/IPSL, CEA‐CNRS‐UVSQ Université Paris‐Saclay Gif‐sur‐Yvette France

4. Laboratoire d'Océanographie et du Climat: Expérimentations et Approches Numériques, LOCEAN/IPSL, UPMC‐CNRS‐IRD‐MNHN Sorbonne Université Paris France

5. Freie Universität Berlin Germany

6. Leibniz‐Institut für Gewässerökologie und Binnenfischerei (IGB) Berlin Germany

7. Museum für Naturkunde Berlin Germany

Abstract

Genetic non‐invasive sampling (gNIS) is a critical tool for population genetics studies, supporting conservation efforts while imposing minimal impacts on wildlife. However, gNIS often presents variable levels of DNA degradation and non‐endogenous contamination, which can incur considerable processing costs. Furthermore, the use of restriction‐site‐associated DNA sequencing methods (RADseq) for assessing thousands of genetic markers introduces the challenge of obtaining large sets of shared loci with similar coverage across multiple individuals. Here, we present an approach to handling large‐scale gNIS‐based datasets using data from the spotted hyena population inhabiting the Ngorongoro Crater in Tanzania. We generated 3RADseq data for more than a thousand individuals, mostly from faecal mucus samples collected non‐invasively and varying in DNA degradation and contamination level. Using small‐scale sequencing, we screened samples for endogenous DNA content, removed highly contaminated samples, confirmed overlap fragment length between libraries, and balanced individual representation in a sequencing pool. We evaluated the impact of (1) DNA degradation and contamination of non‐invasive samples, (2) PCR duplicates and (3) different SNP filters on genotype accuracy based on Mendelian error estimated for parent–offspring trio datasets. Our results showed that when balanced for sequencing depth, contaminated samples presented similar genotype error rates to those of non‐contaminated samples. We also showed that PCR duplicates and different SNP filters impact genotype accuracy. In summary, we showed the potential of using gNIS for large‐scale genetic monitoring based on SNPs and demonstrated how to improve control over library preparation by using a weighted re‐pooling strategy that considers the endogenous DNA content.
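The abstract measures genotype accuracy as Mendelian error in parent–offspring trios. The following is a minimal illustrative sketch of that general idea, not the authors' pipeline: it assumes biallelic SNPs with unphased genotypes given as allele pairs, treats missing genotypes as `None`, and counts a SNP as an error when the offspring genotype cannot be formed from one maternal and one paternal allele. The function names and the toy data are hypothetical.

```python
from itertools import product


def is_mendelian_consistent(mother, father, child):
    """Return True if the child genotype can be built from one allele of each parent.

    Genotypes are unphased allele pairs, e.g. (0, 1). This encoding is an
    assumption made for illustration.
    """
    child_sorted = tuple(sorted(child))
    return any(
        tuple(sorted((m, f))) == child_sorted
        for m, f in product(mother, father)
    )


def mendelian_error_rate(trio_genotypes):
    """Fraction of fully genotyped SNPs that violate Mendelian inheritance.

    trio_genotypes: iterable of (mother, father, child) genotypes, one per SNP.
    SNPs where any member of the trio is missing (None) are skipped.
    """
    tested = 0
    errors = 0
    for mother, father, child in trio_genotypes:
        if mother is None or father is None or child is None:
            continue
        tested += 1
        if not is_mendelian_consistent(mother, father, child):
            errors += 1
    return errors / tested if tested else float("nan")


# Toy trio (hypothetical data, not from the study).
trio = [
    ((0, 0), (0, 1), (0, 1)),  # consistent
    ((0, 0), (0, 0), (0, 1)),  # error: neither parent carries allele 1
    ((1, 1), (0, 1), (0, 0)),  # error: mother can only transmit allele 1
    ((0, 1), (0, 1), (1, 1)),  # consistent
    ((0, 1), None, (0, 0)),    # skipped: father genotype missing
]
rate = mendelian_error_rate(trio)
print(rate)  # 0.5 (2 errors out of 4 fully genotyped SNPs)
```

Comparing this rate across sample categories (for example contaminated versus non‐contaminated samples) or across SNP filter settings would mirror the kind of comparison the abstract describes, under the assumptions stated above.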

Publisher

Wiley

Subject

Genetics; Ecology, Evolution, Behavior and Systematics; Biotechnology
