Abstract
The development of multiple high-quality reference genome sequences in many taxonomic groups has yielded a high-resolution view of the patterns and processes of molecular evolution. Nonetheless, leveraging information across multiple reference haplotypes remains a significant challenge in nearly all eukaryotic systems. These challenges range from studying the evolution of chromosome structure, to finding candidate genes for quantitative trait loci, to testing hypotheses about speciation and adaptation in nature. Here, we address these challenges through the concept of a pan-genome annotation, where conserved gene order is used to restrict gene families and define the expected physical position of all genes that share a common ancestor among multiple genome annotations. By leveraging pan-genome annotations and exploring the underlying syntenic relationships among genomes, we dissect presence-absence and structural variation at four levels of biological organization: among three tetraploid cotton species, across 300 million years of vertebrate sex chromosome evolution, across the diversity of the Poaceae (grass) plant family, and among 26 maize cultivars. The methods to build and visualize syntenic pan-genome annotations in the GENESPACE R package offer a significant addition to existing gene family and synteny programs, especially in polyploid, outbred and other complex genomes.
Publisher
Cold Spring Harbor Laboratory
Cited by
10 articles.