Abstract
Background
Mobile elements (MEs) constitute a major portion of the genome in primates and other higher eukaryotes, and they play important roles in genome evolution and gene function. MEs can be divided into two fundamentally different classes: DNA transposons, which transpose in the genome in a “cut-and-paste” style, and retrotransposons, which propagate in a “copy-and-paste” fashion via a process involving transcription and reverse transcription. In primate genomes, DNA transposons are mostly dead, while many retrotransposons are still highly active. We report here the identification of a new type of ME, which we call “retro-DNAs” because they combine characteristics of these two fundamentally different ME classes.
Methods
A comparative computational genomic approach was used to analyze the reference genome sequences of 10 primate species, consisting of five apes, four monkeys, and the marmoset.
Results
We identified a total of 1,750 retro-DNAs, representing 748 unique insertion events, in the genomes of the ten primate species, including human. These retro-DNAs contain sequences of DNA transposons but lack the terminal inverted repeats (TIRs) that are the hallmark of DNA transposons. Instead, they show characteristics of retrotransposons, such as polyA tails, longer target-site duplications (TSDs), and the “TT/AAAA” insertion site motif, suggesting the use of the L1-based target-primed reverse transcription (TPRT) mechanism. At least 40% of these retro-DNAs are located in genic regions and therefore have the potential to affect gene function. More interestingly, some retro-DNAs, as well as their parent sites, show some level of expression, suggesting that they have the potential to create more retro-DNA copies in present-day primate genomes.
Conclusions
Although these retro-DNAs are small in number, their identification reveals a new mechanism for propagating DNA transposons in primate genomes in the absence of canonical DNA transposon activity. Our data also suggest that the TPRT machinery may transpose a wider variety of DNA sequences in the genome.
Funder
Ontario Research Foundation
Natural Sciences and Engineering Research Council of Canada
Canada Foundation for Innovation
Canada Research Chairs