Abstract
Estimating RNA modifications from Nanopore direct RNA sequencing data is an important task for the RNA research community. Current computational methods do not provide satisfactory results because they segment the raw signal inaccurately. We develop a new method, SegPore, that uses a molecular jiggling translocation hypothesis to segment the raw signal. SegPore is a pure white-box model with superior interpretability, and it significantly reduces structured noise in the raw signal. Based on the improved signal segmentation, SegPore+m6Anet achieves state-of-the-art performance in m6A identification. Additionally, we demonstrate SegPore’s interpretable results and decent performance on inosine modification estimation and RNA secondary structure estimation. An interesting discovery in RNA structure estimation is that read end points occur at the start of stem structures along the reverse transcription direction. Our results indicate that SegPore can concurrently estimate multiple modifications at the individual-molecule level from the same Nanopore direct RNA sequencing data, and they shed light on RNA structure estimation from a novel angle.
Publisher
Cold Spring Harbor Laboratory