Mobilisation and analyses of publicly available SARS-CoV-2 data for pandemic responses
Authors:
Rahman Nadim1, O'Cathail Colman1, Zyoud Ahmad1, Sokolov Alexey1, Oude Munnink Bas2, Grüning Björn3, Cummins Carla1, Amid Clara2, Nieuwenhuijse David F.2, Visontai Dávid4, Yuan David Yu1, Gupta Dipayan1, Prasad Divyae K.2, Gulyás Gábor Máté5, Rinck Gabriele1, McKinnon Jasmine1, Rajan Jeena1, Knaggs Jeff1, Skiby Jeffrey Edward5, Stéger József4, Szarvas Judit5, Gueye Khadim1, Papp Krisztián4, Hoek Maarten2, Kumar Manish1, Ventouratou Marianna A.1, Bouquieaux Marie-Catherine2, Koliba Martin5, Mansurova Milena1, Haseeb Muhammad1, Worp Nathalie2, Harrison Peter W.1, Leinonen Rasko1, Thorne Ross1, Selvakumar Sandeep1, Hunt Sarah1, Venkataraman Sundar1, Jayathilaka Suran1, Cezard Timothée1, Maier Wolfgang3, Waheed Zahra1, Iqbal Zamin1, Aarestrup Frank Møller5, Csabai Istvan4, Koopmans Marion2, Burdett Tony1, Cochrane Guy1
Affiliations:
1. European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, UK
2. Erasmus Medical Center, Wytemaweg 80, 3015 CN Rotterdam, Netherlands
3. University of Freiburg, Friedrichstr. 39, 79098 Freiburg, Germany
4. Eötvös Loránd University, H-1053 Budapest, Egyetem tér 1-3, Hungary
5. Technical University of Denmark, Anker Engelunds Vej 101, 2800 Kongens Lyngby, Denmark
Abstract
During the COVID-19 pandemic, large-scale pathogen genomic sequencing became part of the toolbox for surveillance and epidemic research. This resulted in an unprecedented level of data sharing to open repositories, which has actively supported the identification of SARS-CoV-2 structure, molecular interactions, mutations and variants, and has facilitated vaccine development as well as drug reuse studies and drug design. The European COVID-19 Data Platform was launched to support this data sharing, and has resulted in the deposition of several million SARS-CoV-2 raw reads. This paper describes one component of the Platform, the SARS-CoV-2 Data Hubs. We cover (1) open data sharing, (2) tools for submission, analysis, visualisation and data claiming (e.g. ORCID), (3) the systematic analysis of these datasets at scale via the SARS-CoV-2 Data Hubs, and (4) lessons learnt. The Data Hubs enable the extension and set-up of infrastructure that we intend to use more widely in the future for pathogen surveillance and pandemic preparedness.
Publisher
Microbiology Society