Abstract
Background: Sequencing of the 16S rRNA gene has been the standard for studying the composition of microbial communities. While it allows identification of bacteria at the level of species, it does not usually provide sufficient information to resolve at the sub-species level. Species-level resolution is not adequate for studies of transmission or stability, or for exploring subspecies variation in disease association. Current approaches using whole metagenome shotgun sequencing require very high coverage that can be cost-prohibitive and computationally challenging for diverse communities. Thus there is a need for high-resolution, yet cost-effective, high-throughput methods for characterizing microbial communities.

Results: Significant improvement in resolution for amplicon-based bacterial community analysis was achieved by combining amplicon sequencing of a high-diversity marker gene, the ribosomal operon ISR, with a probabilistic error modeling algorithm, DADA2. The resolving power of this new approach was compared to that of both standard and high-resolution 16S-based approaches using a set of longitudinal subgingival plaque samples. The ISR strategy achieved a 5.2-fold increase in community richness compared to reference-based 16S rRNA gene analysis, and showed 100% accuracy in predicting the correct source of a clinical sample. Individuals’ microbial communities were highly personalized, and although they exhibited some drift in membership and levels over time, that difference was always smaller than the differences between any two subjects, even after one year. The construction of an ISR database from publicly available genomic sequences allowed us to explore genomic variation within species, resulting in the identification of multiple variants of the ISR for most species.

Conclusions: The ISR approach resulted in significantly improved resolution of communities, and revealed a highly personalized, stable human oral microbiota. Multiple ISR types were observed for all species examined, demonstrating a high level of subspecies variation in the oral microbiota. The approach is high-throughput, high-resolution yet cost-effective, allowing subspecies-level community fingerprinting at a cost comparable to that of 16S rRNA gene amplicon sequencing. It will be useful for a range of applications that require high-resolution identification of organisms, including microbial tracking, community fingerprinting, and potentially for identification of virulence-associated strains.
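
The source-prediction and stability comparisons reported in the Results can be illustrated with a small sketch. The abstract does not state which distance measure or classifier was used, so the Bray-Curtis dissimilarity and nearest-profile assignment below are assumptions chosen for illustration, and the subject names, variant labels, and counts are hypothetical toy data rather than study results.

```python
from itertools import combinations


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two profiles (dict: variant -> count)."""
    keys = set(a) | set(b)
    num = sum(abs(a.get(k, 0) - b.get(k, 0)) for k in keys)
    den = sum(a.get(k, 0) + b.get(k, 0) for k in keys)
    return num / den if den else 0.0


def predict_source(query, baseline):
    """Assign a sample to the subject whose baseline profile is closest."""
    return min(baseline, key=lambda s: bray_curtis(query, baseline[s]))


# Hypothetical sequence-variant counts per subject (NOT data from the study).
baseline = {
    "subject_A": {"v1": 50, "v2": 30, "v3": 20},
    "subject_B": {"v4": 60, "v5": 25, "v1": 15},
    "subject_C": {"v6": 70, "v2": 10, "v7": 20},
}
followup = {
    "subject_A": {"v1": 45, "v2": 35, "v3": 15, "v8": 5},
    "subject_B": {"v4": 55, "v5": 30, "v1": 10, "v9": 5},
    "subject_C": {"v6": 65, "v2": 15, "v7": 20},
}

# Fraction of follow-up samples assigned back to the correct subject.
predictions = {s: predict_source(p, baseline) for s, p in followup.items()}
accuracy = sum(s == p for s, p in predictions.items()) / len(predictions)

# Drift within each subject over time vs. differences between subjects
# (between-subject distances are computed on the baseline profiles only).
within = [bray_curtis(baseline[s], followup[s]) for s in baseline]
between = [bray_curtis(baseline[s], baseline[t])
           for s, t in combinations(baseline, 2)]
stable = max(within) < min(between)

print(f"accuracy={accuracy:.2f}, personalized_and_stable={stable}")
```

In this toy setting a community counts as personalized and stable when the largest within-subject distance is still smaller than the smallest between-subject distance, which mirrors the comparison described in the abstract.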
Publisher
Cold Spring Harbor Laboratory