Abstract
AbstractAccurately estimating the effective reproduction number (Rt) of a circulating pathogen is a fundamental challenge in the study of infectious disease. The fields of epidemiology and pathogen phylodynamics both share this goal, but to date, methodologies and data employed by each remain largely distinct. Here we present EpiFusion: a joint approach that can be used to harness the complementary strengths of each field to improve estimation of outbreak dynamics for large and poorly sampled epidemics, such as arboviral or respiratory outbreaks, and validate it for retrospective analysis. We propose a model of Rt that estimates outbreak trajectories conditional upon both phylodynamic (time-scaled trees estimated from genetic sequences) and epidemiological (case incidence) data. We simulate stochastic outbreak trajectories that are weighted according to epidemiological and phylodynamic observation models and fit using particle Markov Chain Monte Carlo. To assess performance, we test EpiFusion on simulated outbreaks in which transmission and/or surveillance rapidly changes and find that using EpiFusion to combine epidemiological and phylodynamic data maintains accuracy and increases certainty in trajectory and Rt estimates, compared to when each data type is used alone. Finally, we benchmark EpiFusion’s performance against existing methods to estimate Rt and demonstrate advances in efficiency and accuracy. Importantly, our approach scales efficiently with dataset size, including the use of phylogenetic trees generated from large genomic datasets. EpiFusion is designed to accommodate future extensions that will improve its utility, such as introduction of population structure, accommodations for phylogenetic uncertainty, and the ability to weight the contributions of genomic or case incidence to the inference.Author SummaryUnderstanding infectious disease spread is fundamental to protecting public health, but can be challenging as disease spread is a phenomenon that cannot be directly observed. So, epidemiologists use data in conjunction with mathematical models to estimate disease dynamics. Often, combinations of different models and data can be used to answer the same questions – for example ‘traditional’ epidemiology commonly uses case incidence data (the number of people who have tested positive for a disease at a certain time) whereas phylodynamic models use pathogen genomic sequence data and our knowledge of their evolution to model disease population dynamics. Each of these approaches have strengths and limitations, and data of each type can be sparse or biased, particularly in rapidly developing outbreaks or lower-middle income countries. An increasing number of approaches attempt to fix this problem by incorporating diverse concepts and data types together in their models. We aim to contribute to this movement by introducing EpiFusion, a modelling framework that makes improvements on efficiency and temporal resolution. EpiFusion uses particle filtering to simulate epidemic trajectories over time and weight their likelihood according to both case incidence data and a phylogenetic tree using separate observation models, resulting in the inference of trajectories in agreement with both sets of data. Improvements in our ability to accurately and confidently model pathogen spread help us to respond to infectious disease outbreaks and improve public health.
Publisher
Cold Spring Harbor Laboratory