Abstract
Advances in genomic sequencing have recently offered vast opportunities for biological exploration, unraveling the evolution and improving our understanding of Earth biodiversity. Due to distinct plant species characteristics in terms of genome size, ploidy and heterozygosity, transposable elements (TEs) are common characteristics of many genomes. TEs are ubiquitous and dispersed repetitive DNA sequences that frequently impact the evolution and composition of the genome, mainly due to their redundancy and rearrangements. For this study, we provided an atlas of TE data by employing an easy-to-use portal (APTE website). To our knowledge, this is the most extensive and standardized analysis of TEs in plant genomes. We evaluated 67 plant genomes assembled at chromosome scale, recovering a total of 49,802,023 TE records, representing a total of 47,992,091,043 (~47,62%) base pairs (bp) of the total genomic space. We observed that new types of TEs were identified and annotated compared to other data repositories. By establishing a standardized catalog of TE annotation on 67 genomes, new hypotheses, exploration of TE data and their influences on the genomes may allow a better understanding of their function and processes. All original code and an example of how we developed the TE annotation strategy is available on GitHub (Extended data).
Funder
Conselho Nacional de Desenvolvimento Científico e Tecnológico
NVIDIA from the GPU Grant Program 2019 - Accelerated Data Science Call for the GPU Seed Units: Titan V device
PROPPG - UTFPR
NAPI de Bioinformática - Fundação Araucária
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil
Subject
General Pharmacology, Toxicology and Pharmaceutics,General Immunology and Microbiology,General Biochemistry, Genetics and Molecular Biology,General Medicine
Cited by
15 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献