Author:
Shen Lishuang, Maglinte Dennis, Ostrow Dejerianne, Pandey Utsav, Bootwalla Moiz, Ryutov Alex, Govindarajan Ananthanarayanan, Ruble David, Han Jennifer, Triche Timothy J., Bard Jennifer Dien, Biegel Jaclyn A., Judkins Alexander R., Gai Xiaowu
Abstract
Effective response to the Coronavirus Disease 2019 (COVID-19) pandemic requires genomic resources and bioinformatics tools for genomic epidemiology and surveillance studies that involve characterizing full-length viral genomes, identifying origins of infections, determining the relatedness of viral infections, performing phylogenetic analyses, and monitoring the continuous evolution of the SARS-CoV-2 viral genomes. The Children’s Hospital, Los Angeles (CHLA) COVID-19 Analysis Research Database (CARD) (https://covid19.cpmbiodev.net/) is a comprehensive genomic resource that provides access to full-length SARS-CoV-2 viral genomes and associated metadata for over 30,000 isolates (as of May 20, 2020) collected from global sequencing repositories and from sequencing performed at the Center for Personalized Medicine (CPM) at CHLA. Reference phylogenetic trees of global and USA viral isolates were constructed from selected high-quality SARS-CoV-2 genome sequences and are periodically updated. These trees provide the baseline and analytical context for identifying the origin of a viral infection, as well as the relatedness of SARS-CoV-2 genomes of interest. A web-based, interactive Phylogenetic Tree Browser supports flexible tree manipulation and advanced analysis, including keyword search with highlighting and time-series animation, as well as subtree export for graphical representation or offline exploration. A Virus Genome Tracker accepts a complete or partial SARS-CoV-2 genome sequence, compares it against all available sequences in the database (>30,000 at time of writing), detects and annotates the variants, and places the new viral isolate within the global or USA phylogenetic context based on variant profiles and haplotype comparisons, all within a few seconds. The resulting analysis can potentially aid genomic surveillance by tracing the transmission of any new infection.
Using CHLA CARD, we demonstrate the identification of a candidate outbreak point where 13 of 31 CHLA internal isolates may have originated. We also discovered multiple indels of unknown clinical significance in the orf3a gene, and revealed a number of USA-specific variants and haplotypes.
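To make the idea of placing an isolate by its variant profile concrete, the following is a minimal sketch and not the Virus Genome Tracker's actual implementation, which the abstract does not describe. It assumes sequences are already aligned to the reference and of equal length, considers substitutions only, and ranks database isolates by the number of variants that differ between profiles. The function names (call_variants, closest_isolates) and the toy sequences are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# A variant is (1-based position, reference base, alternate base).
Variant = Tuple[int, str, str]


def call_variants(reference: str, query: str) -> Set[Variant]:
    """Collect substitutions between two pre-aligned, equal-length sequences.

    Positions where either sequence has 'N' or a gap ('-') are skipped,
    so indels are not reported by this simplified sketch.
    """
    if len(reference) != len(query):
        raise ValueError("sequences must be pre-aligned to the same length")
    return {
        (i + 1, r, q)
        for i, (r, q) in enumerate(zip(reference, query))
        if r != q and r not in "N-" and q not in "N-"
    }


def closest_isolates(
    query_profile: Set[Variant], database: Dict[str, Set[Variant]]
) -> Tuple[int, List[str]]:
    """Return the smallest profile distance and the isolates that attain it.

    Distance is the size of the symmetric difference between two variant
    profiles, i.e. the number of variants present in one but not the other.
    """
    distances = {
        name: len(query_profile ^ profile) for name, profile in database.items()
    }
    best = min(distances.values())
    return best, sorted(name for name, d in distances.items() if d == best)


# Toy data (hypothetical): a 10-base "genome" and three database isolates.
reference = "ACGTACGTAC"
database = {
    "isolate_A": call_variants(reference, "ACGTTCGTAC"),
    "isolate_B": call_variants(reference, "ACGTTCGTAA"),
    "isolate_C": call_variants(reference, "GCGTACGTAC"),
}
query_profile = call_variants(reference, "ACGTTCGTAA")
distance, matches = closest_isolates(query_profile, database)
```

In this toy case the query shares its full variant profile with isolate_B, so that isolate is returned at distance 0. A production system would also need to handle indels and partial genomes, for example by restricting the comparison to the region the query covers.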
Publisher
Cold Spring Harbor Laboratory