Abstract
Comprehensive sampling of natural genetic diversity with metagenomics enables highly resolved insights into the interplay between ecology and evolution. However, intra-population genomic variation represents the outcome of both stochastic and selective forces, making it difficult to identify whether variants are maintained by adaptive, neutral, or purifying processes. This is partly due to the reliance on gene sequences to interpret variants, which disregards the physical properties of three-dimensional gene products that define the functional landscape on which selection acts. Here we describe an approach to analyze genetic variation in the context of predicted protein structures, and apply it to study a marine microbial population within the SAR11 subclade 1a.3.V, which dominates low-latitude surface oceans. Our analyses reveal a tight association between the patterns of nonsynonymous polymorphism, selective pressures, and structural properties of proteins such as per-site relative solvent accessibility and distance to ligands, which explain up to 59% of genetic variance in some genes. In glutamine synthetase, a central gene in nitrogen metabolism, we observe decreased occurrence of nonsynonymous variants at ligand-binding sites as a function of nitrate concentrations in the environment, revealing genetic targets of distinct evolutionary pressures maintained by nutrient availability. Our data also reveal that rare codons are purified from ligand-binding sites when genes are under high selection, demonstrating the utility of structure-aware analyses to study the variants that likely impact translational processes.
Overall, our work yields insights into the governing principles of evolution that shape the genetic diversity landscape within a globally abundant population, and makes available a software framework for structure-aware investigations of microbial population genetics.
Significance
Increasing availability of metagenomes offers new opportunities to study evolution, but the equal treatment of all variants limits insights into drivers of sequence diversity. By capitalizing on recent advances in protein structure prediction capabilities, our study examines subtle evolutionary dynamics of a microbial population that dominates surface oceans through the lens of structural biology. We demonstrate the utility of structure-informed metrics to understand the distribution of nonsynonymous polymorphism, establish insights into the impact of changing nutrient availability on protein evolution, and show that even synonymous variants are scrutinized strictly to maximize translational efficiency when selection is high. Overall, our work illustrates new opportunities for discovery at the intersection between metagenomics and structural bioinformatics, and offers an interactive and scalable software platform to visualize and analyze genetic variants in the context of predicted protein structures and ligand-binding sites.
Publisher
Cold Spring Harbor Laboratory