Abstract
The GISAID database contains more than 100,000 SARS-CoV-2 genomes, including sequences of the recently discovered SARS-CoV-2 omicron variant and of prior SARS-CoV-2 strains that have been collected from patients around the world since the beginning of the pandemic. We applied unsupervised cluster analysis to the SARS-CoV-2 genomes, assessing their similarity at a genome-wide level using the Jaccard index and principal component analysis. Our analysis shows that the omicron variant sequences are most similar to sequences that were submitted early in the pandemic, around January 2020. Furthermore, the omicron variants in GISAID are spread across the entire range of the first principal component, suggesting that the strain has been in circulation for some time. This observation supports the hypothesis that the omicron strain originated from a long-term infection.
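The similarity analysis described above can be illustrated with a minimal sketch. It assumes each genome has already been encoded as a binary vector marking the positions where it differs from a reference sequence; the abstract does not specify this encoding, so it is an assumption here. The toy data and the function names `jaccard_matrix` and `principal_components` are hypothetical and are not the authors' pipeline.

```python
import numpy as np

def jaccard_matrix(X):
    """Pairwise Jaccard similarity between the rows of a 0/1 matrix X."""
    X = np.asarray(X, dtype=np.int64)
    inter = X @ X.T                                   # shared mutations per pair
    sizes = X.sum(axis=1)                             # mutations per genome
    union = sizes[:, None] + sizes[None, :] - inter   # mutations in either genome
    # Assumption: two genomes with no mutations at all count as identical.
    return np.where(union == 0, 1.0, inter / np.maximum(union, 1))

def principal_components(S, k=2):
    """Scores of the rows of S on its top-k principal components."""
    centered = S - S.mean(axis=0)
    U, s, _ = np.linalg.svd(centered, full_matrices=False)
    return U[:, :k] * s[:k]

# Toy data (hypothetical): 4 genomes x 5 variable positions,
# 1 = differs from the reference at that position.
genomes = np.array([
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 1, 1, 1],
])

S = jaccard_matrix(genomes)
pcs = principal_components(S, k=2)
print(np.round(S, 3))
print(pcs[:, 0])  # first principal component score for each genome
```

In this toy example the first two genomes share most of their mutations, as do the last two, so the two pairs fall on opposite sides of the first principal component. A group of sequences spread widely along that component, as the abstract reports for omicron, would instead show up as scores covering much of the component's range.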
Publisher
Cold Spring Harbor Laboratory