An independent evaluation in a CRC patient cohort of microbiome 16S rRNA sequence analysis methods: OTU clustering, DADA2, and Deblur

Published: 2023-07-25
Volume: 14
ISSN: 1664-302X
Container-title: Frontiers in Microbiology
Short-container-title: Front. Microbiol.

Author:

Liu Guang, Li Tong, Zhu Xiaoyan, Zhang Xuanping, Wang Jiayin

Abstract

The 16S rRNA gene is universal among microbes and is often used as a target gene to profile microbial communities via next-generation sequencing (NGS) technology. Traditionally, sequences are clustered into operational taxonomic units (OTUs) at a 97% identity threshold based on the 16S rRNA taxonomic standard, and methods for reducing sequencing errors are bypassed, which may lead to false classification units. Several denoising algorithms, such as DADA2 and Deblur, have been published to solve this problem; they correct sequencing errors at single-nucleotide resolution by generating amplicon sequence variants (ASVs). As high-resolution ASVs are becoming more popular than OTUs and only one analysis method is usually selected in a particular study, there is a need for a thorough comparison of OTU clustering and denoising pipelines. In this study, three of the most widely used 16S rRNA methods (two denoising algorithms, DADA2 and Deblur, along with de novo OTU clustering) were thoroughly compared using 16S rRNA amplicon sequencing data generated from 358 clinical stool samples from the Colorectal Cancer (CRC) Screening Cohort. Our findings indicated that all approaches led to similar taxonomic profiles (with P > 0.05 in PERMANOVA and P < 0.001 in the Mantel test), although the number of ASVs/OTUs and the alpha-diversity indices varied considerably. Despite considerable differences in the disease-related markers identified, disease-related analysis showed that all methods could lead to similar conclusions. Fusobacterium, Streptococcus, Peptostreptococcus, Parvimonas, Gemella, and Haemophilus were identified by all three methods as enriched in the CRC group, while Roseburia, Faecalibacterium, Butyricicoccus, and Blautia were identified by all three methods as enriched in the healthy group.
In addition, disease-diagnostic models generated using machine learning algorithms based on the data from these different methods all achieved good diagnostic efficiency (AUC: 0.87–0.89), with the model based on DADA2 producing the highest AUC (0.8944 and 0.8907 in the training set and test set, respectively). However, there was no significant difference in performance between the models (P > 0.05). In conclusion, this study demonstrates that DADA2, Deblur, and de novo OTU clustering display similar power in taxa assignment and can produce similar conclusions in the case of the CRC cohort.
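The difference in resolution between 97% OTU clustering and sequence variants, as described in the abstract, can be illustrated with a toy sketch. This is not the pipeline used in the study and does not implement DADA2 or Deblur, which rely on error models to denoise reads. It only assumes equal-length reads, a simple per-position identity, and a greedy centroid rule; the function names and the toy reads are hypothetical.

```python
# Toy illustration only: real OTU tools align reads and sort by abundance,
# and real denoisers (DADA2, Deblur) model sequencing errors.

SWAP = {"A": "C", "C": "G", "G": "T", "T": "A"}


def mutate(seq, positions):
    """Return seq with a substitution at each listed position."""
    chars = list(seq)
    for p in positions:
        chars[p] = SWAP[chars[p]]
    return "".join(chars)


def identity(a, b):
    """Fraction of matching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("toy identity assumes equal-length reads")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def greedy_otu_clusters(reads, threshold=0.97):
    """Assign each read to the first centroid with identity >= threshold."""
    clusters = {}  # centroid sequence -> list of member reads
    for read in reads:
        for centroid in clusters:
            if identity(read, centroid) >= threshold:
                clusters[centroid].append(read)
                break
        else:
            # No centroid is close enough, so this read starts a new OTU.
            clusters[read] = [read]
    return clusters


def exact_variants(reads):
    """Count every distinct sequence separately (single-nucleotide resolution)."""
    counts = {}
    for read in reads:
        counts[read] = counts.get(read, 0) + 1
    return counts


base = "ACGT" * 25                               # 100-nt toy read
near = mutate(base, [10])                        # 1 mismatch: 99% identity
far = mutate(base, [5, 20, 35, 50, 65])          # 5 mismatches: 95% identity
reads = [base, base, near, far]

otus = greedy_otu_clusters(reads)
variants = exact_variants(reads)
print(len(otus), "OTUs at 97% vs", len(variants), "distinct sequences")
```

In this toy input the read with one mismatch is absorbed into the same OTU as the base sequence, while exact-sequence counting keeps it separate. Whether that extra variant is real biology or a sequencing error is the question that denoising algorithms are designed to answer.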

Publisher

Frontiers Media SA

Subject

Microbiology (medical), Microbiology

Cited by 7 articles.

1. Taxonomic composition and functional potentials of gastrointestinal microbiota in 12 wild-stranded cetaceans;Frontiers in Microbiology;2024-08-29

2. Datasets of fungal diversity and pseudo-chromosomal genomes of mangrove rhizosphere soil in China;Scientific Data;2024-08-20

3. Standardising a microbiome pipeline for body fluid identification from complex crime scene stains;2024-08-07

4. Inventorizing marine biodiversity using eDNA data from Indonesian coral reefs: comparative high throughput analysis using different bioinformatic pipelines;Marine Biodiversity;2024-04-05

5. Investigating the causal role of the gut microbiota in esophageal cancer and its subtypes: a two-sample Mendelian randomization study;BMC Cancer;2024-04-04