Abstract
Circular RNAs are a novel class of endogenous non-coding RNAs that have been widely discovered in eukaryotic transcriptomes. The circular structure arises from a non-canonical splicing process in which a donor site backsplices to an upstream acceptor site. These circular RNAs are conserved across species and often show tissue- or cell-specific expression. Emerging evidence suggests that they play vital roles in gene regulation and are associated with various types of diseases. As a fundamental step toward elucidating their function and mechanism, numerous efforts have been devoted to predicting circular RNA from its primary sequence. However, statistical learning methods are constrained by the information presented with explicit features, and the existing deep learning approach falls short of fully exploring the positional information of the splice sites and their deep interaction.

We present an effective and robust end-to-end framework, JEDI, for circular RNA prediction using only the nucleotide sequence. Our framework first leverages the attention mechanism to encode each junction site based on deep bidirectional recurrent neural networks and then introduces a novel cross-attention layer to model the deep interaction among these sites for backsplicing. JEDI is therefore capable not only of addressing the task of circular RNA prediction but also of interpreting the relationships among splice sites to discover the hotspots for backsplicing within a gene region. Experimental evaluations demonstrate that JEDI significantly outperforms several state-of-the-art approaches in circular RNA prediction at both the isoform level and the gene level. Moreover, JEDI also shows promising results on zero-shot backsplicing discovery, which none of the existing approaches can achieve.

The implementation of our framework is available at https://github.com/hallogameboy/JEDI.
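To make the idea of cross-attention between splice sites concrete, the following is a minimal sketch of generic scaled dot-product cross-attention in plain Python. It is an illustration only and is not taken from the JEDI implementation: the function names (`cross_attention`, `softmax`, `dot`), the toy two-dimensional embeddings, and the choice of scaled dot-product scoring are all assumptions. In the framework described above, the site embeddings would instead come from the attentive recurrent encoders.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def cross_attention(queries, keys):
    """Let every query vector attend over all key vectors.

    Returns (contexts, weights): contexts[i] is the attention-weighted
    sum of the keys for query i, and weights[i][j] is how strongly
    query i attends to key j.
    """
    dim = len(keys[0])
    scale = math.sqrt(dim)
    contexts, weights = [], []
    for q in queries:
        w = softmax([dot(q, k) / scale for k in keys])
        ctx = [sum(w_j * k[d] for w_j, k in zip(w, keys)) for d in range(dim)]
        contexts.append(ctx)
        weights.append(w)
    return contexts, weights


# Hypothetical toy embeddings: 2 donor sites and 3 acceptor sites.
donors = [[1.0, 0.0], [0.0, 1.0]]
acceptors = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]

contexts, weights = cross_attention(donors, acceptors)
for i, row in enumerate(weights):
    print(f"donor {i} attention over acceptors:", [round(x, 3) for x in row])
```

In a sketch like this, each row of `weights` can be read as a soft pairing between one donor site and every candidate acceptor site, which is the kind of signal an interpretable model could inspect when looking for backsplicing hotspots.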
Publisher
Cold Spring Harbor Laboratory