Using AnnoTree to get more assignments, faster, in DIAMOND+MEGAN microbiome analysis

Published: 2021-11-24

Author:

Gautam Anupam, Felderhoff Hendrik, Bağci Caner, Huson Daniel H.

Abstract

In microbiome analysis, one main approach is to align metagenomic sequencing reads against a protein reference database such as NCBI-nr, and then to perform taxonomic and functional binning based on the alignments. This approach is embodied, for example, in the standard DIAMOND+MEGAN analysis pipeline, which first aligns reads against NCBI-nr using DIAMOND and then performs taxonomic and functional binning using MEGAN. Here we propose using the AnnoTree protein database, rather than NCBI-nr, in such alignment-based analyses to determine the prokaryotic content of metagenomic samples. We demonstrate a 2-fold speedup over using the prokaryotic part of NCBI-nr, together with increased assignment rates; in particular, twice as many reads are assigned to KEGG. In addition to binning to the NCBI taxonomy, MEGAN now also bins to the GTDB taxonomy.

Importance

The NCBI-nr database is not explicitly designed for microbiome analysis, and its increasing size makes it unwieldy and computationally expensive for this purpose. The AnnoTree protein database is only one quarter the size of the full NCBI-nr database and is explicitly designed for metagenomic analysis, and so should be supported by alignment-based pipelines.

Publisher

Cold Spring Harbor Laboratory
