Abstract
Repeatable experiments with accurate data collection and reproducible analyses are fundamental to the scientific method but may be difficult to achieve in practice. Several flexible, open-source tools developed for the R and Python coding environments aid the reproducibility of data wrangling and analysis in scientific research. In contrast, analogous tools are generally lacking for earlier stages, such as systematic labelling and processing of field samples with hierarchical structure (e.g. time points of individuals from multiple lines or populations) or curating heterogeneous data collected by different researchers over several years. Such tools are critical for modern research given trends toward globally distributed collaborators using higher-throughput technologies. As a step toward improving the repeatability of methods for collecting biological samples and curating biological data, we introduce the R package baRcodeR and the PyTrackDat pipeline in Python. The baRcodeR package provides tools for generating biologically informative, hierarchical labels with digitally encoded 2D barcodes that can be printed and scanned using low-cost commercial hardware. The PyTrackDat pipeline integrates with baRcodeR output to build a web interface for sample management and tracking along with data collection and curation. We briefly describe the application of principles from baRcodeR and PyTrackDat in three large research projects, which demonstrate their value in (i) helping to document sampling methods, (ii) facilitating collaboration and (iii) reducing opportunities for human errors and omissions that could otherwise propagate through downstream data analysis to compromise biological inference.
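The idea of hierarchical sample labels (e.g. time points nested within individuals nested within populations) can be illustrated with a minimal Python sketch. This is not the baRcodeR or PyTrackDat API: the function name `hierarchical_labels`, the prefixes `POP`, `IND` and `T`, and the hyphen-separated label format are all assumptions made purely for illustration.

```python
from itertools import product


def hierarchical_labels(levels, digits=2):
    """Build every label for a nested sampling design.

    `levels` is a list of (prefix, count) pairs, outermost level first.
    This helper is hypothetical and only illustrates the labelling idea.
    """
    # One list of zero-padded codes per level, e.g. ["POP01", "POP02"].
    codes_per_level = [
        [f"{prefix}{i:0{digits}d}" for i in range(1, count + 1)]
        for prefix, count in levels
    ]
    # The Cartesian product enumerates every combination of level codes.
    return ["-".join(parts) for parts in product(*codes_per_level)]


# Illustrative design: 2 populations x 3 individuals x 2 time points.
labels = hierarchical_labels([("POP", 2), ("IND", 3), ("T", 2)])
print(labels[0], labels[-1], len(labels))
```

Each label string could then be encoded as a 2D barcode and printed, so that the sampling design is recoverable from the label itself.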
Publisher
Cold Spring Harbor Laboratory