Affiliation:
1. Beijing Institute of Genomics, Chinese Academy of Sciences, and China National Center for Bioinformation, Beijing 100101, China
2. University of Chinese Academy of Sciences, Beijing 100049, China
Abstract
To contain infectious diseases, it is crucial to determine the origin and transmission routes of the pathogen, as well as how it evolves. With the development of genome sequencing technology, genome epidemiology has emerged as a powerful approach for investigating the source and transmission of pathogens. In this study, we first present the rationale for genomic tracing of SARS-CoV-2 and the challenges currently faced. Identifying the reference sequence most genetically similar to the query sequence is a critical step in genomic tracing, typically achieved using either a phylogenetic tree or a sequence similarity search. However, these methods become inefficient or computationally prohibitive when the reference database contains tens of millions of sequences, as was the case during the COVID-19 pandemic. To address this challenge, we developed a novel genomic tracing algorithm capable of processing 6 million SARS-CoV-2 sequences in less than a minute. Instead of constructing a giant phylogenetic tree, we devised a weighted scoring system based on mutation characteristics to quantify sequence similarity. The method demonstrated superior performance compared to previous methods. Additionally, we developed an online platform to facilitate genomic tracing and visualization of the spatiotemporal distribution of sequences. The method will be a valuable addition to standard epidemiological investigations, enabling more efficient genomic tracing. Furthermore, the computational framework can be easily adapted to other pathogens, paving the way for routine genomic tracing of infectious diseases.
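The abstract does not give the weights or the exact scoring formula, so the following is only a minimal sketch of the general idea of ranking reference sequences by a weighted mutation score rather than by building a tree. Everything in it is an assumption made for illustration: the inverse-frequency weighting, the penalty for non-shared mutations, the function names (`mutation_weights`, `similarity`, `rank_references`), and the toy mutation sets. It should not be read as the authors' algorithm.

```python
from collections import Counter
from math import log


def mutation_weights(reference_db):
    """Assumed weighting: mutations that are rarer in the database weigh more."""
    n = len(reference_db)
    counts = Counter(m for muts in reference_db.values() for m in muts)
    return {m: log((n + 1) / (c + 1)) + 1.0 for m, c in counts.items()}


def similarity(query, ref, weights, default_weight=1.0):
    """Weighted score: reward shared mutations, penalize non-shared ones."""
    shared = query & ref
    differing = query ^ ref  # mutations present in exactly one of the two sets
    gain = sum(weights.get(m, default_weight) for m in shared)
    penalty = sum(weights.get(m, default_weight) for m in differing)
    return gain - penalty


def rank_references(query, reference_db, top_k=3):
    """Return the top_k (name, score) pairs, best match first."""
    weights = mutation_weights(reference_db)
    scored = [
        (similarity(query, muts, weights), name)
        for name, muts in reference_db.items()
    ]
    # Sort by descending score, breaking ties by name for determinism.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [(name, score) for score, name in scored[:top_k]]


# Toy data: each sequence is represented by its set of mutations
# relative to a common reference genome (illustrative values only).
reference_db = {
    "ref_A": {"C241T", "A23403G"},
    "ref_B": {"C241T", "A23403G", "G28881A"},
    "ref_C": {"C241T"},
}
query = {"C241T", "A23403G", "G28881A"}

for name, score in rank_references(query, reference_db):
    print(f"{name}\t{score:.3f}")
```

Representing each genome as a set of mutations keeps every comparison proportional to the number of mutations rather than to genome length, which is one plausible reason a scoring approach of this kind can scale to millions of reference sequences.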
Funder
National Natural Science Foundation of China
Strategic Priority Research Program of the Chinese Academy of Sciences, China
Capital Health Development and Research Special Programme
Key Collaborative Research Program of the Alliance of International Science Organizations
Publisher
Oxford University Press (OUP)
Subject
Molecular Biology, Information Systems
Cited by
1 article.