Towards increased accuracy and reproducibility in SARS-CoV-2 next generation sequence analysis for public health surveillance

Published: 2022-11-03

Author:

Connor Ryan, Yarmosh David A., Maier Wolfgang, Shakya Migun, Martin Ross, Bradford Rebecca, Brister J. Rodney, Chain Patrick SG, Copeland Courtney A., di Iulio Julia, Hu Bin, Ebert Philip, Gunti Jonathan, Jin Yumi, Katz Kenneth S., Kochergin Andrey, LaRosa Tré, Li Jiani, Li Po-E, Lo Chien-Chi, Rashid Sujatha, Maiorova Evguenia S., Xiao Chunlin, Zalunin Vadim, Pruitt Kim D.

Abstract

During the COVID-19 pandemic, SARS-CoV-2 surveillance efforts integrated genome sequencing of clinical samples to identify emergent viral variants and to support rapid experimental examination of genome-informed vaccine and therapeutic designs. Given the broad range of methods applied to generate new viral genomes, it is critical that consensus and variant calling tools yield consistent results across disparate pipelines. Here we examine the impact of sequencing technologies (Illumina and Oxford Nanopore) and 7 different downstream bioinformatic protocols on SARS-CoV-2 variant calling as part of the NIH Accelerating COVID-19 Therapeutic Interventions and Vaccines (ACTIV) Tracking Resistance and Coronavirus Evolution (TRACE) initiative, a public-private partnership established to address the COVID-19 outbreak. Our results indicate that bioinformatic workflows can yield consensus genomes with different single nucleotide polymorphisms, insertions, and/or deletions even when using the same raw sequence input datasets. We introduce the use of a specific suite of parameters and protocols that greatly improves the agreement among pipelines developed by diverse organizations. Such consistency among bioinformatic pipelines is fundamental to SARS-CoV-2 and future pathogen surveillance efforts. The application of analysis standards is necessary to more accurately document phylogenomic trends and support data-driven public health responses.

Publisher

Cold Spring Harbor Laboratory

References (53 articles)

1. GenBank

2. The Sequence Read Archive: a decade more of explosive growth

3. Pan, B. et al. Assessing reproducibility of inherited variants detected with short-read whole genome sequencing. Genome Biol. 23, (2022).

4. Performance assessment of DNA sequencing platforms in the ABRF Next-Generation Sequencing Study

5. Krishnan, V. et al. Benchmarking workflows to assess performance and suitability of germline variant calling pipelines in clinical diagnostic assays. BMC Bioinformatics 22, (2021).

Cited by 4 articles.

1. Two decades of population genomics: will we ever agree on bacterial species? BMC Biology, 2024-01-26.

2. Putting everything in its place: using the INSDC compliant Pathogen Data Object Model to better structure genomic data submitted for public health applications. Microbial Genomics, 2023-12-12.

3. Database resources of the National Center for Biotechnology Information. Nucleic Acids Research, 2023-11-22.

4. We All Know Standardization Is Key, But How Do We Get There with Clinical Metagenomics? Clinical Chemistry, 2023-08-03.