Abstract
AbstractHigh throughput sequencing enables rapid genome sequencing during infectious disease outbreaks and provides an opportunity to quantify the evolutionary dynamics of pathogens in near real-time. One difficulty of undertaking evolutionary analyses over short timescales is the dependency of the inferred evolutionary parameters on the timespan of observation. Here, we characterise the molecular evolutionary dynamics of SARS-CoV-2 and 2009 pandemic H1N1 (pH1N1) influenza during the first 12 months of their respective pandemics. We use Bayesian phylogenetic methods to estimate the dates of emergence, evolutionary rates, and growth rates of SARS-CoV-2 and pH1N1 over time and investigate how varying sampling window and dataset sizes affects the accuracy of parameter estimation. We further use a generalised McDonald-Kreitman test to estimate the number of segregating non-neutral sites over time. We find that the inferred evolutionary parameters for both pandemics are time-dependent, and that the inferred rates of SARS-CoV-2 and pH1N1 decline by ∼50% and ∼100%, respectively, over the course of one year. After at least 4 months since the start of sequence sampling, inferred growth rates and emergence dates remain relatively stable and can be inferred reliably using a logistic growth coalescent model. We show that the time-dependency of the mean substitution rate is due to elevated substitution rates at terminal branches which are 2-4 times higher than those of internal branches for both viruses. The elevated rate at terminal branches is strongly correlated with an increasing number of segregating non-neutral sites, demonstrating the role of purifying selection in generating the time-dependency of evolutionary parameters during pandemics.
Publisher
Cold Spring Harbor Laboratory