Abstract
Alu exonization, or the recruitment of intronic Alu elements into gene sequences, has contributed to functional diversification; however, its extent and the ways in which it influences gene regulation are not fully understood. We developed an unbiased approach to predict Alu exonization events from genomic sequences, implemented in a deep learning model, eXAlu, that overcomes the limitations of tissue or condition specificity and the computational burden of RNA-seq analysis. The model captures previously reported characteristics of exonized Alu sequences and can predict sequence elements important for Alu exonization. Using eXAlu, we estimate the number of Alu elements in the human genome undergoing exonization to be between 55K and 110K, 11- to 21-fold more than are represented in the GENCODE gene database. Using RT-PCR, we validated selected predicted Alu exonization events, supporting the accuracy of our method. Lastly, we highlight a potential application of our method: identifying polymorphic Alu insertion exonizations in individuals and in the population from whole-genome sequencing data.
Publisher
Cold Spring Harbor Laboratory
Cited by
1 article.