Abstract
Bacterial genetic discontinuity, representing abrupt breaks in genomic identity among species, is crucial for grasping microbial diversity and evolution. Advances in genomic sequencing have enhanced our ability to track and characterize genetic discontinuity in bacterial populations. However, systematically exploring the degree to which bacterial diversity exists as a continuum or is sorted into discrete and readily defined species remains a challenge in microbial ecology. Here, we aimed to quantify the genetic discontinuity (δ) and investigate how this metric is related to ecology. We harnessed a dataset comprising 210,129 genomes to systematically explore genetic discontinuity patterns across several distantly related species, finding clear breakpoints which varied depending on the taxa in question. By delving into pangenome characteristics, we uncovered a significant association between pangenome saturation and genetic discontinuity. Closed pangenomes were associated with more pronounced breaks, exemplified by Mycobacterium tuberculosis. Additionally, through a machine learning approach, we detected key features that impact genetic discontinuity prediction. Our study enhances the understanding of bacterial genetic patterns and their ecological implications, offering insights into species boundaries for prokaryotes.
Publisher
Cold Spring Harbor Laboratory