Reproducible acquisition, management, and meta-analysis of nucleotide sequence (meta)data using q2-fondue

Published: 2022-03-25

Author:

Michal Ziemski, Anja Adamov, Lina Kim, Lena Flörl, Nicholas A. Bokulich

Abstract

The volume of public nucleotide sequence data has blossomed over the past two decades, enabling novel discoveries via re-analysis, meta-analyses, and comparative studies for uncovering general biological trends. However, reproducible re-use and management of sequence datasets remain a challenge. We created the software plugin q2-fondue to enable user-friendly acquisition, re-use, and management of public nucleotide sequence (meta)data while adhering to open data principles. The software allows fully provenance-tracked programmatic access to and management of data from the Sequence Read Archive (SRA). Sequence data and accompanying metadata retrieved with q2-fondue follow a validated format, which is interoperable with the QIIME 2 ecosystem and its multiple user interfaces. To highlight the manifold capabilities of q2-fondue, we present several demonstration analyses using amplicon, whole genome, and shotgun metagenome datasets. These use cases demonstrate how q2-fondue increases analysis reproducibility and transparency from data download to final visualizations by including source details in the integrated provenance graph. We believe q2-fondue will lower existing barriers to comparative analyses of nucleotide sequence data, enabling more transparent, open, and reproducible conduct of meta-analyses. q2-fondue is a Python 3 package released under the BSD 3-clause license at https://github.com/bokulich-lab/q2-fondue.

Publisher

Cold Spring Harbor Laboratory
