Authors:
Liu Hui, Yan Xue-Mei, Wang Xin-rui, Zhang Dong-Xu, Zhou Qingyuan, Shi Tian-Le, Jia Kai-Hua, Tian Xue-Chan, Zhou Shan-Shan, Zhang Ren-Gang, Yun Quan-Zheng, Wang Qing, Xiang Qiuhong, Mannapperuma Chanaka, Van Zalen Elena, Street Nathaniel R., Porth Ilga, El-Kassaby Yousry A., Zhao Wei, Wang Xiao-Ru, Guan Wenbin, Mao Jian-Feng
Abstract
In-depth genome characterization is still lacking for most biofuel crops, especially for centromeres, which play a fundamental role during nuclear division and in the maintenance of genome stability. This study applied long-read sequencing technologies to assemble a highly contiguous genome for yellowhorn (Xanthoceras sorbifolium), an oil-producing tree, and conducted extensive comparative analyses to understand centromere structure and evolution as well as fatty acid biosynthesis. We produced a reference-level genome of yellowhorn, ∼470 Mb in length, with ∼95% of contigs anchored onto 15 chromosomes. Genome annotation identified 22,049 protein-coding genes and classified 65.7% of the genome sequence as repetitive elements. Long terminal repeat retrotransposons (LTR-RTs) account for ∼30% of the yellowhorn genome, a proportion maintained by a moderate birth rate and a low removal rate. We identified the centromeric regions on each chromosome and found enrichment of centromere-specific LINE1 and Gypsy retrotransposons in these regions, which have evolved recently (∼0.7 MYA). We compared the genomes of three cultivars and found frequent inversions. We analyzed transcriptomes from different tissues and identified candidate genes involved in very-long-chain fatty acid biosynthesis, together with their expression profiles. Collinear block analysis showed that yellowhorn shares the gamma (γ) hexaploidy event with Vitis vinifera but has not undergone any further whole-genome duplication. This study provides excellent genomic resources for understanding centromere structure and evolution and for functional studies in this important oil-producing plant.