Abstract
The “universal target” region of the gene encoding the 60 kDa chaperonin protein (cpn60, also known as groEL or hsp60) is a proven sequence barcode for bacteria and a useful target for marker gene amplicon-based studies of complex microbial communities. To date, identification of cpn60 sequence variants from microbiome studies has been accomplished by alignment of queries to a reference database. Naïve Bayesian classifiers offer an alternative identification method that provides variable rank classification and shorter analysis times. We curated a set of cpn60 barcode sequences to train the RDP classifier and tested its performance on data from previous human microbiome studies. Results showed that sequences accounting for 79%, 86% and 92% of the observations (read counts) in saliva, vagina and infant stool microbiome data sets were classified to the species rank. We also trained the QIIME 2 q2-feature-classifier on cpn60 sequence data and demonstrated that it gives results consistent with the standalone RDP classifier. Successful implementation of a naïve Bayesian classifier for cpn60 sequences will facilitate future microbiome studies and open opportunities to integrate cpn60 amplicon sequence identification into existing analysis pipelines.
Funder
Gouvernement du Canada | Natural Sciences and Engineering Research Council of Canada
Publisher
Springer Science and Business Media LLC
Cited by
3 articles.