Abstract
An estimated 2 billion species of microbes exist on Earth, with orders of magnitude more strains. A microbial pangenome is built by aggregating all genomes of a single clade, and it reflects the metabolic diversity of that group of organisms. As de novo metagenome analysis techniques have matured and reference genome databases have expanded, metapangenome analysis has risen in popularity as a way to organize the functional potential of organisms in relation to the environment from which they were sampled. However, reliance on assembly and binning, or on reference databases, often leaves substantial portions of a metagenome unanalyzed and so underestimates the functional potential of the community. To address this challenge, we present a method for metapangenomics that relies on amino acid k-mers (kaa-mers) and metagenome assembly graph queries. To enable this method, we first show that kaa-mers estimate pangenome characteristics and that open reading frames can be accurately predicted from short shotgun sequencing reads using the previously developed tool orpheum. Together, these techniques allow pangenomics to be performed directly on short sequencing reads. To extend this to metapangenome analysis, we combine these approaches with compact de Bruijn assembly graph queries that directly retrieve, from a metagenome, the set of sequencing reads belonging to a specific species. Applied to a time series of stool metagenomes from an individual receiving antibiotics, these approaches identify strain fluctuations that coincide with antibiotic exposure.
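As a rough illustration of how kaa-mers can summarize a pangenome, the sketch below decomposes protein sequences into overlapping amino acid k-mers and labels each k-mer as core (present in every genome) or accessory (present in only some). It is a minimal sketch, not the authors' pipeline: the function names, the k-mer length `k=7`, and the strict "present in all genomes" definition of core are assumptions made for this example, and the toy sequences are invented.

```python
from collections import Counter


def kaa_mers(protein, k=7):
    """Return the set of overlapping amino acid k-mers in one protein sequence."""
    # Sequences shorter than k yield an empty set.
    return {protein[i:i + k] for i in range(len(protein) - k + 1)}


def genome_kaa_mers(proteins, k=7):
    """Return the union of kaa-mers over all proteins of one genome."""
    merged = set()
    for protein in proteins:
        merged |= kaa_mers(protein, k)
    return merged


def pangenome_summary(genomes, k=7):
    """Split the pangenome's kaa-mers into core and accessory sets.

    `genomes` maps a genome name to a list of protein sequences.
    Core (an assumption of this sketch): kaa-mers found in every genome.
    Accessory: all remaining kaa-mers.
    """
    counts = Counter()
    for proteins in genomes.values():
        # Each genome contributes each of its kaa-mers at most once.
        counts.update(genome_kaa_mers(proteins, k))
    n_genomes = len(genomes)
    core = {kmer for kmer, seen in counts.items() if seen == n_genomes}
    accessory = set(counts) - core
    return core, accessory


if __name__ == "__main__":
    # Invented toy proteins that differ only in the final residue.
    toy_genomes = {
        "genome_a": ["MKTAYIAK"],
        "genome_b": ["MKTAYIAR"],
    }
    core, accessory = pangenome_summary(toy_genomes, k=7)
    print("core:", sorted(core))
    print("accessory:", sorted(accessory))
```

In practice the protein sequences would come from open reading frames predicted on reads or genomes, and real analyses typically subsample k-mers rather than store all of them; both steps are outside the scope of this sketch.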
Publisher
Cold Spring Harbor Laboratory
Cited by
1 article.