Capturing provenance information for biomedical data and workflows: A scoping review

Author:

Gierend Kerstin (1), Krüger Frank (2), Genehr Sascha (3), Hartmann Francisca (1), Siegel Fabian (1), Waltemath Dagmar (4), Ganslandt Thomas (5), Zeleke Atinkut Alamirrew (4)

Affiliation:

1. Department of Biomedical Informatics at the Center for Preventive Medicine and Digital Health, Medical Faculty Mannheim, Heidelberg University, Mannheim

2. Department of Electrical Engineering and Computer Science, Faculty of Engineering, Wismar University of Applied Sciences

3. Department of Communications Engineering, University of Rostock

4. Department of Medical Informatics, University Medicine Greifswald

5. Chair of Medical Informatics, Friedrich-Alexander-Universität Erlangen-Nürnberg

Abstract

Background: Provenance-enriched scientific results ensure their reproducibility and trustworthiness, particularly when containing sensitive data. Provenance information leads to higher interpretability of scientific results and enables reliable collaboration and data sharing. However, the lack of comprehensive evidence on provenance approaches hinders the uptake of good scientific practice in clinical research. Our scoping review identifies evidence regarding approaches and criteria for provenance tracking in the biomedical domain. We investigate the state-of-the-art frameworks, associated artifacts, and methodologies for provenance tracking.

Methods: This scoping review followed the methodological framework by Arksey and O'Malley. The PubMed and Web of Science databases were searched for English-language articles published from January 1, 2006, to March 23, 2021. Title and abstract screening was carried out by four independent reviewers using the Rayyan screening tool. A majority vote was required to agree on the eligibility of papers based on the defined inclusion and exclusion criteria. Full-text reading and screening were performed independently by two reviewers, and information was extracted into a pre-tested template for the five research questions. Disagreements were resolved by a domain expert. The study protocol has previously been published.

Results: The search resulted in a total of 564 papers. Of 469 identified, de-duplicated papers, 54 studies fulfilled the inclusion criteria and were subjected to five research questions. The review identified heterogeneous tracking approaches, their artifacts, and varying degrees of fulfillment of the research questions. Based on this, we developed a roadmap for a tailor-made provenance framework considering the software life cycle.

Conclusions: In this paper we investigate the state-of-the-art frameworks, associated artifacts, and methodologies for provenance tracking, including real-life applications. We observe that most authors assume ideal conditions for provenance tracking. However, our analysis discloses several gaps, for which we illustrate future steps toward a systematic provenance strategy. We believe the recommendations enforce quality and guide the implementation of auditable and measurable provenance approaches as well as solutions in the daily routine of biomedical scientists.

Publisher

Research Square Platform LLC
