Author:
Huang Qiang, Qiu Huining, Bible Paul W., Huang Yong, Zheng Fangfang, Gu Jing, Sun Jian, Hao Yuantao, Liu Yu
Abstract
Background: The need for precise public health and clinical interventions during the COVID-19 pandemic has spurred a global rush to track SARS-CoV-2 variants, but current tracking approaches are challenged by the flood of viral genome sequences, which leads to a loss of timeliness, accuracy, and reliability. Here, we devise a new co-mutation network framework to tackle these difficulties in variant surveillance.
Methods: To avoid inputting and modeling the whole large-scale dataset at once, we dynamically investigate the nucleotide covariation pattern of weekly sequences. A community detection algorithm is applied to a network of co-occurring genomic alterations constructed from the mutation corpora of weekly collected data. Co-mutation communities are identified, extracted, and characterized as variant markers. They contribute to the creation and weekly updating of a community-based variant dictionary tree representing SARS-CoV-2 evolution, in which highly similar communities from different weeks are merged to represent the same variant. Emerging communities imply the presence of novel viral variants or new branches of existing variants. This process was benchmarked with worldwide GISAID data and validated using national-level data from six COVID-19 hotspot countries.
Results: A total of 235 co-mutation communities were identified over a 120-week investigation of worldwide sequence data, from March 2020 to mid-June 2022. The dictionary tree progressively developed from these communities perfectly recorded the time course of SARS-CoV-2 branching, coinciding with GISAID clades. The time-varying prevalence of these communities in the viral population matched well with the emergence and circulation of the variants they represented. These benchmark results both exhibited the features of the methodology and demonstrated high efficiency in detecting the pandemic variants. When applied to regional variant surveillance, our method identified the feature communities of major WHO-named SARS-CoV-2 variants significantly earlier than Pangolin's monitoring.
Conclusion: An efficient genomic surveillance framework built from weekly co-mutation networks and a dynamic community-based variant dictionary tree enables early detection and continuous investigation of SARS-CoV-2 variants despite the flood of genomic data, aiding the response to the COVID-19 pandemic.
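The weekly workflow described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the community detection algorithm, so connected components of a thresholded co-occurrence network are used here as a simple stand-in, and the Jaccard similarity, the `min_count` and `threshold` parameters, the flat dictionary (rather than a tree), and the mutation labels `m1` to `m9` are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def co_mutation_edges(samples, min_count=2):
    """Count how often each pair of mutations co-occurs in the same genome."""
    counts = Counter()
    for mutations in samples:
        for a, b in combinations(sorted(set(mutations)), 2):
            counts[(a, b)] += 1
    # Keep only pairs seen together at least min_count times (assumed cutoff).
    return {pair: n for pair, n in counts.items() if n >= min_count}


def communities(edges):
    """Stand-in community detection: connected components via union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    return [frozenset(g) for g in groups.values()]


def jaccard(a, b):
    """Similarity of two mutation sets (assumed measure)."""
    return len(a & b) / len(a | b)


def update_dictionary(dictionary, week, new_communities, threshold=0.8):
    """Merge communities similar to known ones; register the rest as emerging."""
    emerging = []
    for comm in new_communities:
        match = next((k for k in dictionary if jaccard(k, comm) >= threshold), None)
        if match is None:
            dictionary[comm] = [week]
            emerging.append(comm)
        else:
            dictionary[match].append(week)
    return emerging


# Toy data: each sample is the set of mutations observed in one genome.
week1 = [{"m1", "m2"}, {"m1", "m2"}, {"m1", "m2", "m9"}]
week2 = [{"m1", "m2"}, {"m1", "m2"}, {"m3", "m4", "m5"}, {"m3", "m4", "m5"}]

dictionary = {}
emerging_w1 = update_dictionary(dictionary, 1, communities(co_mutation_edges(week1)))
emerging_w2 = update_dictionary(dictionary, 2, communities(co_mutation_edges(week2)))
```

In this toy run, the community seen in week 1 recurs in week 2 and is merged into the existing entry, while the second week-2 community has no close match and is reported as emerging. Only one week of data is processed at a time, which mirrors the stated goal of avoiding a single pass over the whole dataset.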
Funder
Basic and Applied Basic Research Foundation of Guangdong Province
Subject
Public Health, Environmental and Occupational Health
Cited by
1 article.