PPIT: an R package for inferring microbial taxonomy from nifH sequences

Published:2021-02-13 Issue:16 Volume:37 Page:2289-2298
ISSN:1367-4803
Container-title:Bioinformatics
language:en

Author:

Kapili Bennett J¹, Dekas Anne E¹

Affiliation:

1. Department of Earth System Science, Stanford University, Stanford, CA 94305, USA

Abstract

Motivation: Linking microbial community members to their ecological functions is a central goal of environmental microbiology. When assigned taxonomy, amplicon sequences of metabolic marker genes can suggest such links, thereby offering an overview of the phylogenetic structure underpinning particular ecosystem functions. However, inferring microbial taxonomy from metabolic marker gene sequences remains a challenge, particularly for the frequently sequenced nitrogen fixation marker gene, nitrogenase reductase (nifH). Horizontal gene transfer in recent nifH evolutionary history can confound taxonomic inferences drawn from the pairwise identity methods used in existing software. Other methods for inferring taxonomy are not standardized and require manual inspection that is difficult to scale.

Results: We present Phylogenetic Placement for Inferring Taxonomy (PPIT), an R package that infers microbial taxonomy from nifH amplicons using both phylogenetic and sequence identity approaches. After users place query sequences on a reference nifH gene tree provided by PPIT (n = 6317 full-length nifH sequences), PPIT searches the phylogenetic neighborhood of each query sequence and attempts to infer microbial taxonomy. An inference is drawn only if references in the phylogenetic neighborhood (1) are taxonomically consistent and (2) share sufficient pairwise identity with the query, thereby avoiding erroneous inferences due to known horizontal gene transfer events. We find that PPIT returns a higher proportion of correct taxonomic inferences than BLAST-based approaches at the cost of fewer total inferences. We demonstrate PPIT on deep-sea sediment and find that Deltaproteobacteria are the most abundant potential diazotrophs. Using this dataset, we show that emending PPIT inferences based on visual inspection of query sequence placement can achieve taxonomic inferences for nearly all sequences in a query set. We additionally discuss how users can apply PPIT to the analysis of other marker genes.

Availability and implementation: PPIT is freely available to noncommercial users at https://github.com/bkapili/ppit. Installation includes a vignette that demonstrates package use and reproduces the nifH amplicon analysis discussed here. The raw nifH amplicon sequence data have been deposited in the GenBank, EMBL and DDBJ databases under BioProject number PRJEB37167.

Supplementary information: Supplementary data are available at Bioinformatics online.
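
The two-part decision rule described in the abstract (taxonomic consistency of neighboring references, plus sufficient pairwise identity with the query) can be illustrated with a minimal Python sketch. This is not PPIT's code or API: PPIT is an R package, and the names `Neighbor` and `infer_taxon`, the 90% identity cutoff, and the agreement threshold are all hypothetical values chosen for illustration only.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Neighbor:
    """A reference sequence near the query on the tree (hypothetical type)."""
    taxon: str       # taxonomic label of the reference at the rank of interest
    identity: float  # percent pairwise identity between reference and query


def infer_taxon(neighbors: List[Neighbor],
                min_identity: float = 90.0,   # assumed cutoff, not PPIT's value
                min_agreement: float = 1.0    # assumed: all neighbors must agree
                ) -> Optional[str]:
    """Return a taxon only if both checks pass; otherwise make no inference."""
    if not neighbors:
        return None

    # Check 1: the phylogenetic neighborhood must be taxonomically consistent.
    counts = Counter(n.taxon for n in neighbors)
    taxon, count = counts.most_common(1)[0]
    if count / len(neighbors) < min_agreement:
        return None

    # Check 2: at least one agreeing reference must be similar enough to the query.
    best_identity = max(n.identity for n in neighbors if n.taxon == taxon)
    if best_identity < min_identity:
        return None

    return taxon


consistent = [Neighbor("Deltaproteobacteria", 94.2),
              Neighbor("Deltaproteobacteria", 91.0)]
mixed = [Neighbor("Deltaproteobacteria", 94.2),
         Neighbor("Firmicutes", 93.5)]
distant = [Neighbor("Deltaproteobacteria", 72.0)]

print(infer_taxon(consistent))  # inference made
print(infer_taxon(mixed))       # no inference: neighborhood disagrees
print(infer_taxon(distant))     # no inference: identity too low
```

Declining to infer when either check fails mirrors the trade-off stated in the abstract: fewer total inferences in exchange for a higher proportion of correct ones.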

Funder

National Science Foundation

Graduate Research Fellowship

Publisher

Oxford University Press (OUP)

Subject

Computational Mathematics, Computational Theory and Mathematics, Computer Science Applications, Molecular Biology, Biochemistry, Statistics and Probability

Link

http://academic.oup.com/bioinformatics/advance-article-pdf/doi/10.1093/bioinformatics/btab100/39648251/btab100.pdf

Cited by 15 articles.

1. Abundance, identity, and potential diazotrophic activity of nifH-containing organisms at marine cold seeps;2024-07-31

2. Urea assimilation and oxidation supports the activity of a phylogenetically diverse microbial community in the dark ocean;2024-07-27

3. Intensification of harmful cyanobacterial blooms in a eutrophic, temperate lake caused by nitrogen, temperature, and CO2;Science of The Total Environment;2024-03

4. Biological nitrogen fixation and the role of soil diazotroph niche breadth in representative terrestrial ecosystems;Soil Biology and Biochemistry;2024-02

5. Autochthonous carbon loading of macroalgae stimulates benthic biological nitrogen fixation rates in shallow coastal marine sediments;Frontiers in Microbiology;2024-01-05