Abstract
The fields of viral ecology and evolution have expanded rapidly in the last two decades, driven by technological improvements and motivated by efforts to discover potentially zoonotic wildlife viruses under the rubric of pandemic prevention. One consequence has been a massive proliferation of host-virus association data, which form the backbone of research in viral macroecology and zoonotic risk prediction. These data remain fragmented across numerous data portals and projects, each with its own scope, structure, and reporting standards. Here, we propose that synthesizing host-virus association data is a central challenge for improving our understanding of the global virome and developing foundational theory in viral ecology. To illustrate this, we build an open, reconciled mammal-virus database from four key published datasets, applying a standardized taxonomy and metadata. We show that reconciling these datasets provides a substantially richer view of the mammal virome than any individual database offers. We argue for a shift in best practice towards the incremental development and use of synthetic datasets in viral ecology research, both to improve comparability and replicability across studies and to facilitate future efforts to use machine learning to predict the structure and dynamics of the global virome.
Publisher
Cold Spring Harbor Laboratory