Abstract
Galega orientalis, a leguminous herb in the family Fabaceae, is an ecologically and economically important species that is widely cultivated for its strong stress resistance and high protein content. However, no genomic information has been reported for G. orientalis, which limits evolutionary analysis of the species. Because the chloroplast genome is small, it is relatively easy to sequence and is well suited to phylogenetic studies and molecular marker development. Here, the chloroplast genome of G. orientalis was sequenced and annotated. The chloroplast genome of G. orientalis is 125,280 bp in length, with a GC content of 34.11%. A total of 107 genes were identified, including 74 protein-coding genes, 29 tRNA genes and four rRNA genes. One inverted repeat (IR) region has been lost from the chloroplast genome of G. orientalis. In addition, five genes (rpl22, ycf2, rps16, trnE-UUC and pbf1) have been lost relative to the chloroplast genome of the related species G. officinalis. A total of 84 long repeats and 68 simple sequence repeats were detected, which could serve as potential markers in genetic studies of G. orientalis and related species. The Ka/Ks values of three genes (petL, rpl20 and ycf4) were greater than one in pairwise comparisons of G. officinalis with three other Galegeae species (Calophaca sinica, Caragana jubata and Caragana korshinskii), indicating that these three genes are under positive selection. A comparative genomic analysis of 15 Galegeae species showed that most conserved non-coding sequence regions and two genic regions (ycf1 and clpP) were highly divergent and could be used as DNA barcodes for rapid and accurate species identification. Phylogenetic trees constructed from the ycf1 and clpP genes confirmed the evolutionary relationships among Galegeae species.
In addition, among the 15 Galegeae species analyzed, G. orientalis had a unique 30-bp intron in the ycf1 gene, and Tibetia liangshanensis lacked two introns in the clpP gene, contrary to the existing conclusion that only Glycyrrhiza species in the IR-lacking clade (IRLC) lack two introns. In conclusion, the complete chloroplast genome of G. orientalis was determined and annotated for the first time, which could provide insights into the unresolved evolutionary relationships within the tribe Galegeae.
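As an illustration of two quantities reported above, the following minimal Python sketch shows how a GC content percentage can be computed from a sequence and how a Ka/Ks ratio is conventionally interpreted (greater than one suggests positive selection, less than one suggests purifying selection). The function names `gc_content` and `classify_selection` and the tolerance parameter `tol` are hypothetical and chosen for illustration only; this is not the pipeline used in the study, which would normally rely on dedicated tools to estimate Ka and Ks from codon alignments.

```python
def gc_content(seq: str) -> float:
    """Return the GC content of a nucleotide sequence as a percentage.

    Only unambiguous bases (A, C, G, T) are counted in the denominator.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    return 100.0 * gc / acgt if acgt else 0.0


def classify_selection(ka: float, ks: float, tol: float = 1e-9) -> str:
    """Interpret a Ka/Ks ratio using the conventional thresholds.

    ka: nonsynonymous substitution rate (assumed already estimated)
    ks: synonymous substitution rate (assumed already estimated)
    tol: hypothetical tolerance for treating the ratio as exactly one
    """
    if ks <= 0:
        # The ratio is undefined when no synonymous substitutions are observed.
        return "undefined"
    ratio = ka / ks
    if abs(ratio - 1.0) <= tol:
        return "neutral"
    return "positive" if ratio > 1.0 else "purifying"


if __name__ == "__main__":
    # Toy inputs, not data from the study.
    print(gc_content("ATGCATGC"))
    print(classify_selection(0.3, 0.2))
```

In practice, a Ka/Ks value above one for a single gene pair is only suggestive, so such a classification is best treated as a first screen rather than as evidence of selection on its own.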
Subject
Genetics (clinical), Genetics