Abstract
Metabolic gene clusters (MGCs) encode at least three different enzymes for a common biosynthetic pathway. Comparative genome analyses have highlighted the role of duplications, deletions and rearrangements in MGC formation. We hypothesized that these mechanisms also contribute to MGC intraspecies diversity and play a role in adaptation. We assessed copy number variations (CNVs) of four Arabidopsis thaliana MGCs in a population of 1,152 accessions, using experimental and bioinformatic approaches. MGC diversity was lowest in the marneral gene cluster (one private deletion CNV) and highest in the arabidiol/baruol gene cluster, where 811 accessions had gene gains or losses. However, there were no presence/absence variations of entire clusters. We found that the compact version of the thalianol gene cluster was predominant in A. thaliana and more conserved than the noncontiguous version. In the arabidiol/baruol cluster, we found a large insertion in 35% of the analyzed accessions that contained duplications of the reference genes CYP705A2 and BARS1. The BARS1 paralog, which we named BARS2, encoded a novel oxidosqualene synthase. Unexpectedly, in accessions with the insertion, the arabidiol/baruol gene cluster was expressed not only in roots but also in leaves. These accessions also showed different root growth dynamics and were associated with warmer climates than the reference-like accessions. We also found that paired genes encoding terpene synthases and cytochrome P450 oxidases had higher copy number variability than non-paired ones. Our study highlights the importance of intraspecies variation and nonreference genomes for dissecting secondary metabolite biosynthesis pathways and understanding their role in adaptation and evolution.
Publisher
Cold Spring Harbor Laboratory