The European Nucleotide Archive in 2023
Author:
Yuan David1, Ahamed Alisha1, Burgin Josephine1, Cummins Carla1, Devraj Rajkumar1, Gueye Khadim1, Gupta Dipayan1, Gupta Vikas1, Haseeb Muhammad1, Ihsan Maira1, Ivanov Eugene1, Jayathilaka Suran1, Kadhirvelu Vishnukumar Balavenkataraman1, Kumar Manish1, Lathi Ankur1, Leinonen Rasko1, McKinnon Jasmine1, Meszaros Lili1, O’Cathail Colman1, Ouma Dennis1, Paupério Joana1, Pesant Stephane1, Rahman Nadim1, Rinck Gabriele1, Selvakumar Sandeep1, Suman Swati1, Sunthornyotin Yanisa1, Ventouratou Marianna1, Vijayaraja Senthilnathan1, Waheed Zahra1, Woollard Peter1, Zyoud Ahmad1, Burdett Tony1, Cochrane Guy1
Affiliation:
1. European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
Abstract
The European Nucleotide Archive (ENA; https://www.ebi.ac.uk/ena) is maintained by the European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI). The ENA is one of the three members of the International Nucleotide Sequence Database Collaboration (INSDC). It serves the bioinformatics community worldwide through the submission, processing, archiving and dissemination of sequence data. The ENA supports data types ranging from raw reads, through alignments and assemblies, to functional annotation. The data is enriched with contextual information relating to samples and experimental configurations. In this article, we describe recent progress and improvements to ENA services. In particular, we focus on three areas of work in 2023: FAIRness of ENA data, pandemic preparedness and foundational technology. For FAIRness, we have introduced minimal requirements for spatiotemporal annotation, created a metadata-based classification system, incorporated third-party metadata curations with archived records, and developed a new rapid visualisation platform, the ENA Notebooks. For foundational enhancements, we have improved the INSDC data exchange and synchronisation pipelines, and invested in site reliability engineering for ENA infrastructure. To support genomic surveillance efforts, we have continued to provide ENA services in support of SARS-CoV-2 data mobilisation and have adapted these for broader pathogen surveillance efforts.
Funder
European Molecular Biology Laboratory; Gordon and Betty Moore Foundation (Aquatic Symbiosis, UniEuk); European Union's Horizon 2020 and Horizon Europe research and innovation programmes (Aqa-FAANG, AtlantECO, BiCIKL, BioOcean5D, BlueCloud, Blue-Cloud 2026, BovReg, BGE, BY-COVID, EarlyCause, EASI-Genomics, eDNAqua-Plan, ELIXIR-CONVERGE, EOSC-Life, GENE-SwitCh, RECODID, VEO); Biotechnology and Biological Sciences Research Council; Wellcome Trust (SP3)
Publisher
Oxford University Press (OUP)