Principles for data analysis workflows

Published:2021-03-18 Issue:3 Volume:17 Page:e1008770
ISSN:1553-7358
Container-title:PLOS Computational Biology
Language:en
Short-container-title:PLoS Comput Biol

Author:

Stoudt Sara, Vásquez Váleri N., Martinez Ciera C.

Abstract

A systematic and reproducible “workflow”—the process that moves a scientific investigation from raw data to coherent research question to insightful contribution—should be a fundamental part of academic data-intensive research practice. In this paper, we elaborate basic principles of a reproducible data analysis workflow by defining 3 phases: the Explore, Refine, and Produce Phases. Each phase is roughly centered around the audience to whom research decisions, methodologies, and results are being immediately communicated. Importantly, each phase can also give rise to a number of research products beyond traditional academic publications. Where relevant, we draw analogies between design principles and established practice in software development. The guidance provided here is not intended to be a strict rulebook; rather, the suggestions for practices and tools to advance reproducible, sound data-intensive analysis may furnish support for both students new to research and current researchers who are new to data-intensive work.

Publisher

Public Library of Science (PLoS)

Subject

Computational Theory and Mathematics; Cellular and Molecular Neuroscience; Genetics; Molecular Biology; Ecology; Modeling and Simulation; Ecology, Evolution, Behavior and Systematics

Reference: 55 articles.

1. A preliminary review of influential works in data-driven discovery;M Stalzer;Springerplus,2016

2. Software engineering for scientific big data analysis;BA Grüning;Gigascience,2019

3. Build a Career in Data Science;E Robinson, J Nolis;Simon and Schuster,2020

4. A hypothesis is a liability;I Yanai;Genome Biol,2020

5. Designerly Ways of Knowing: Design Discipline Versus Design Science;N Cross;Design Issues,2001

Cited by 20 articles.

1. Replicability and reproducibility of data-intensive design research using workflows - example in facial expression synchrony as a measure of empathy;Journal of Engineering Design;2024-09

2. Ten simple rules for building and maintaining a responsible data science workflow;PLOS Computational Biology;2024-07-18

3. Initial data analysis for longitudinal studies to build a solid foundation for reproducible analysis;PLOS ONE;2024-05-29

4. Critical Review of Selected Analytical Platforms for GC-MS Metabolomics Profiling—Case Study: HS-SPME/GC-MS Analysis of Blackberry’s Aroma;Foods;2024-04-17

5. A survey of experimental stimulus presentation code sharing in major areas of psychology;Behavior Research Methods;2024-04-16