Structure-based knowledge acquisition from electronic lab notebooks for research data provenance documentation
-
Published:2022-01-31
Issue:1
Volume:13
Page:
-
ISSN:2041-1480
-
Container-title:Journal of Biomedical Semantics
-
language:en
-
Short-container-title:J Biomed Semant
Author:
Schröder Max, Staehlke Susanne, Groth Paul, Nebe J. Barbara, Spors Sascha, Krüger Frank
Abstract
Background: Electronic Laboratory Notebooks (ELNs) are used to document experiments and investigations in the wet-lab. Protocols in ELNs contain a detailed description of the conducted steps, including the information necessary to understand the procedure and the resulting research data as well as to reproduce the research investigation. The purpose of this study is to investigate whether such ELN protocols can be used to create semantic documentation of the provenance of research data by means of ontologies and linked data methodologies.
Methods: Based on an ELN protocol of a biomedical wet-lab experiment, a retrospective provenance model of the resulting research data, describing the details of the experiment in a machine-interpretable way, is manually engineered. Furthermore, an automated approach for knowledge acquisition from ELN protocols is derived from these results. This structure-based approach exploits the structure in the experiment’s description, such as headings, tables, and links, to translate the ELN protocol into a semantic knowledge representation. To satisfy the Findable, Accessible, Interoperable, and Reusable (FAIR) guiding principles, a ready-to-publish bundle is created that contains the research data together with their semantic documentation.
Results: While the manual modelling efforts serve as a proof of concept by employing one protocol, the automated structure-based approach demonstrates the potential for generalisation with seven ELN protocols. For each of those protocols, a ready-to-publish bundle is created and, by employing the SPARQL query language, it is illustrated that questions about the processes and the obtained research data can be answered.
Conclusions: The semantic documentation of research data obtained from the ELN protocols allows for the representation of the retrospective provenance of research data in a machine-interpretable way. Research Object Crate (RO-Crate) bundles including these models enable researchers to easily share the research data together with the corresponding documentation, and also to search the experiments and relate them to each other.
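The structure-based idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the protocol is available as simple HTML, treats every `h2` heading as a process step (`prov:Activity`) and every link below a heading as an input of that step (`prov:used`), and ignores tables entirely. The `http://example.org/` namespace, the sample protocol, and the helper names `extract_triples` and `match` are hypothetical; only the PROV-O, RDF, and RDFS IRIs are real vocabulary terms. A plain pattern match stands in for a SPARQL query.

```python
from html.parser import HTMLParser

PROV = "http://www.w3.org/ns/prov#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
BASE = "http://example.org/protocol/"  # hypothetical namespace for this sketch

# Hypothetical ELN protocol fragment: headings mark steps, links mark inputs.
SAMPLE = """
<h2>Cell seeding</h2>
<p>Cells were taken from
<a href="http://example.org/material/cell-line-1">the cell line</a>.</p>
<h2>Staining</h2>
<p>Samples were stained with
<a href="http://example.org/material/staining-kit">the staining kit</a>.</p>
"""


class ProtocolParser(HTMLParser):
    """Map headings to prov:Activity nodes and links to prov:used edges."""

    def __init__(self):
        super().__init__()
        self.triples = []
        self.current_step = None  # IRI of the step opened by the last heading
        self.in_heading = False
        self.step_count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            # Assumption: each second-level heading starts a new step.
            self.in_heading = True
            self.step_count += 1
            self.current_step = f"{BASE}step{self.step_count}"
            self.triples.append((self.current_step, RDF_TYPE, PROV + "Activity"))
        elif tag == "a" and self.current_step:
            # Assumption: a link inside a step names something the step used.
            href = dict(attrs).get("href")
            if href:
                self.triples.append((href, RDF_TYPE, PROV + "Entity"))
                self.triples.append((self.current_step, PROV + "used", href))

    def handle_endtag(self, tag):
        if tag == "h2":
            self.in_heading = False

    def handle_data(self, data):
        # Heading text becomes the human-readable label of the step.
        if self.in_heading and data.strip():
            self.triples.append((self.current_step, RDFS_LABEL, data.strip()))


def extract_triples(html):
    """Parse protocol HTML and return (subject, predicate, object) tuples."""
    parser = ProtocolParser()
    parser.feed(html)
    parser.close()
    return parser.triples


def match(triples, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [t for t in triples
            if s in (None, t[0]) and p in (None, t[1]) and o in (None, t[2])]
```

Calling `match(extract_triples(SAMPLE), p=PROV + "used")` answers the question "which step used which material?" for the sample protocol. In a real setting the tuples would be loaded into an RDF store and queried with SPARQL, and the resulting graph would be shipped next to the data files in a bundle such as an RO-Crate.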
Funder
Deutsche Forschungsgemeinschaft, Universität Rostock
Publisher
Springer Science and Business Media LLC
Subject
Computer Networks and Communications,Health Informatics,Computer Science Applications,Information Systems
Cited by
15 articles.