The health care and life sciences community profile for dataset descriptions

Author:

Dumontier Michel [1], Gray Alasdair J.G. [2], Marshall M. Scott [3], Alexiev Vladimir [4], Ansell Peter [5], Bader Gary [6], Baran Joachim [1], Bolleman Jerven T. [7], Callahan Alison [1], Cruz-Toledo José [8], Gaudet Pascale [9], Gombocz Erich A. [10], Gonzalez-Beltran Alejandra N. [11], Groth Paul [12], Haendel Melissa [13], Ito Maori [14], Jupp Simon [15], Juty Nick [15], Katayama Toshiaki [16], Kobayashi Norio [17], Krishnaswami Kalpana [18], Laibe Camille [15], Le Novère Nicolas [19], Lin Simon [20], Malone James [15], Miller Michael [21], Mungall Christopher J. [22], Rietveld Laurens [23], Wimalaratne Sarala M. [15], Yamaguchi Atsuko [16]

Affiliation:

1. Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, CA, United States of America

2. Department of Computer Science, Heriot-Watt University, Edinburgh, United Kingdom

3. Department of Radiation Oncology (MAASTRO), GROW School for Oncology and Developmental Biology, MAASTRO Clinic, Maastricht, Netherlands

4. Ontotext Corporation, Sofia, Bulgaria

5. CSIRO, Australia

6. The Donnelly Centre, University of Toronto, Toronto, Canada

7. Swiss-Prot group, SIB Swiss Institute of Bioinformatics, Geneve, Switzerland

8. Carleton University, Canada

9. CALIPHO group, SIB Swiss Institute of Bioinformatics, Geneve, Switzerland

10. IO Informatics, Berkeley, CA, United States of America

11. Oxford e-Research Centre, University of Oxford, Oxford, Oxfordshire, United Kingdom

12. Elsevier Labs, Netherlands

13. Department of Medical Informatics and Epidemiology, Oregon Health Sciences University, Portland, OR, United States of America

14. Office of Medical Informatics and Epidemiology, Pharmaceuticals and Medical Devices Agency, Chiyoda-ku, Japan

15. EMBL, European Bioinformatics Institute, Saffron Walden, United Kingdom

16. Database Center for Life Science, Kashiwa, Japan

17. Advanced Center for Computing and Communication, RIKEN, Wako-shi, Saitama, Japan

18. Cerenode Inc., United States of America

19. The Babraham Institute, Cambridge, United Kingdom

20. Nationwide Children’s Hospital, Columbus, OH, United States of America

21. Institute for Systems Biology, Seattle, WA, United States of America

22. Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, United States of America

23. Department of Exact Sciences, VU University Amsterdam, Amsterdam, Netherlands

Abstract

Access to consistent, high-quality metadata is critical to finding, understanding, and reusing scientific data. However, while there are many relevant vocabularies for the annotation of a dataset, none sufficiently captures all the necessary metadata. This prevents uniform indexing and querying of dataset repositories. To provide a practical guide for producing high-quality descriptions of biomedical datasets, the W3C Semantic Web for Health Care and the Life Sciences Interest Group (HCLSIG) identified Resource Description Framework (RDF) vocabularies that could be used to specify common metadata elements and their value sets. The resulting guideline covers elements of description, identification, attribution, versioning, provenance, and content summarization. It reuses existing vocabularies and is intended to meet key functional requirements, including indexing, discovery, exchange, query, and retrieval of datasets, thereby enabling the publication of FAIR data. The resulting metadata profile is generic and could be used by other domains with an interest in providing machine-readable descriptions of versioned datasets.
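To make the idea of a vocabulary-based dataset description concrete, the following is a minimal sketch, not the guideline itself. It emits a small RDF description in Turtle for a dataset and one of its versions, touching on description, attribution, and versioning. The choice of properties (Dublin Core Terms and PAV), the `example.org` identifiers, and the helper names `literal`, `describe`, and `build_description` are all assumptions made for illustration; the abstract does not state which properties the profile requires.

```python
# Illustrative only: the vocabularies and properties below are assumptions,
# not a statement of what the community profile mandates.
PREFIXES = {
    "dct": "http://purl.org/dc/terms/",
    "dctypes": "http://purl.org/dc/dcmitype/",
    "pav": "http://purl.org/pav/",
}


def literal(value):
    """Return a Turtle string literal with backslashes and quotes escaped."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def describe(subject, statements):
    """Serialize one subject and its (predicate, object) pairs as Turtle."""
    body = " ;\n".join(f"    {pred} {obj}" for pred, obj in statements)
    return f"<{subject}>\n{body} ."


def build_description():
    """Build a two-level description: the dataset and one version of it."""
    header = "\n".join(f"@prefix {k}: <{v}> ." for k, v in PREFIXES.items())

    # Hypothetical identifiers; a real description would use the
    # publisher's own stable IRIs.
    dataset = "http://example.org/dataset/demo"
    version = "http://example.org/dataset/demo/v1"

    summary = describe(dataset, [
        ("a", "dctypes:Dataset"),
        ("dct:title", literal("Demo dataset")),
        ("dct:description", literal("A placeholder dataset description.")),
        ("dct:publisher", "<http://example.org/org/demo-lab>"),
    ])
    versioned = describe(version, [
        ("a", "dctypes:Dataset"),
        ("dct:isVersionOf", f"<{dataset}>"),
        ("pav:version", literal("1.0")),
    ])
    return "\n\n".join([header, summary, versioned]) + "\n"


if __name__ == "__main__":
    print(build_description())
```

Keeping the version as a separate resource that points back to the dataset is one way to describe versioned datasets so that each release can carry its own metadata; the output can be checked with any Turtle parser.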

Funder

NIAID

Open PHACTS project and Innovative Medicines Initiative Joint Undertaking

US National Institutes of Health grant

Swiss Federal Government

BBSRC Institute Strategic Programme

Integrated Database Project

National Bioscience Database Center (NBDC, Japan)

Database Center for Life Sciences (DBCLS, Japan)

Publisher

PeerJ

Subject

General Agricultural and Biological Sciences; General Biochemistry, Genetics and Molecular Biology; General Medicine; General Neuroscience


Cited by 17 articles.
