Abstract
Biomedical datasets are increasing in size, are stored across many repositories, and face challenges in FAIRness (findability, accessibility, interoperability, reusability). As a Consortium of infectious disease researchers from 15 Centers, we aim to adopt open science practices to promote transparency, encourage reproducibility, and accelerate research advances through data reuse. To improve the FAIRness of our datasets and computational tools, we evaluated metadata standards across established biomedical data repositories. The vast majority do not adhere to a single standard such as Schema.org, which is widely adopted by generalist repositories. Consequently, datasets in these repositories are not findable in aggregation projects like Google Dataset Search. We addressed this gap by creating a reusable metadata schema based on Schema.org and cataloguing nearly 400 datasets and computational tools that we collected. The approach can be readily reused to create schemas that are interoperable with community standards yet customized to a particular context. Our approach enabled data discovery, increased the reusability of datasets from a large research consortium, and accelerated research. Lastly, we discuss ongoing challenges with FAIRness beyond discoverability.
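As an illustration of what a Schema.org-based dataset record can look like, the following is a minimal sketch in Python. It is not the Consortium's actual schema: the helper names (`make_dataset_record`, `missing_fields`), the choice of required properties, and the example values are all assumptions made for this example. Only the `Dataset` type and the `name`, `description`, `identifier`, and `keywords` properties come from Schema.org itself.

```python
import json

# Assumed minimal set of required properties for this sketch;
# a real, context-specific schema would likely require more.
REQUIRED = ("@type", "name", "description", "identifier")


def make_dataset_record(name, description, identifier, **extra):
    """Build a Schema.org Dataset record as a JSON-LD-ready dict."""
    record = {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "identifier": identifier,
    }
    # Optional Schema.org properties (e.g. keywords) are passed through as-is.
    record.update(extra)
    return record


def missing_fields(record, required=REQUIRED):
    """Return the required properties that are absent or empty."""
    return [key for key in required if not record.get(key)]


# Placeholder values for illustration only, not a real dataset.
record = make_dataset_record(
    name="Example serology dataset",
    description="Placeholder description for illustration only.",
    identifier="example-0001",
    keywords=["infectious disease"],
)

print(json.dumps(record, indent=2))
print(missing_fields(record))
```

A record serialized this way could be embedded in a dataset landing page as JSON-LD, which is the form that aggregators such as Google Dataset Search are designed to read. Checking for missing required properties before publishing is one simple way to keep a catalogue consistent.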
Funder
U.S. Department of Health & Human Services | NIH | National Institute of Allergy and Infectious Diseases
U.S. Department of Health & Human Services | NIH | National Institute of General Medical Sciences
U.S. Department of Health & Human Services | NIH | National Center for Advancing Translational Sciences
Publisher
Springer Science and Business Media LLC
Subject
Library and Information Sciences; Statistics, Probability and Uncertainty; Computer Science Applications; Education; Information Systems; Statistics and Probability