Abstract
Long non-coding RNAs (lncRNAs) regulate gene expression through different molecular mechanisms, including DNA binding. We curated the first database of RNA Binding Sites (RNABSdb) by harmonising publicly available raw data from RNA-DNA binding experiments. This resource is crucial to enable systematic studies on transcriptional regulation driven by lncRNAs. Focusing on high-quality experiments, we find that the number of binding sites for each lncRNA varies from hundreds to tens of thousands. Despite being poorly characterised, the formation of RNA:DNA:DNA triple helices (TPXs) is one of the molecular mechanisms that allow lncRNAs to bind the genome and regulate gene expression. We developed 3plex, a software tool that predicts TPXs in silico. Leveraging the data collected in RNABSdb for lncRNAs known to form functional TPXs, we show that 3plex outperforms existing approaches. Moreover, this analysis shows that TPXs tend to be shorter and more degenerate than previously expected. Finally, we applied 3plex to all the lncRNAs collected in RNABSdb and show that the majority of them could directly bind the genome through TPX formation. Data and software are available at https://molinerislab.github.io/RNABSdb/ and https://github.com/molinerisLab/3plex.
Publisher
Cold Spring Harbor Laboratory