Abstract
Ribonucleic acid (RNA) is a fundamental biological molecule that is essential to all living organisms, performing a versatile array of cellular tasks. The function of many RNA molecules is strongly related to the structure they adopt. As a result, great effort is being dedicated to the design of efficient algorithms that solve the “folding problem”: given a sequence of nucleotides, return a probable list of base pairs, referred to as the secondary structure prediction. Early algorithms have largely relied on finding the structure with minimum free energy. However, these predictions rely on simplified effective free energy models, under which the correct structure is not always the one with the lowest free energy. In light of this, new data-driven approaches that not only consider free energy but also use machine learning techniques to learn motifs have been investigated, and have recently been shown to outperform free-energy-based algorithms on several experimental data sets. In this work, we introduce ExpertRNA, a new algorithm that provides a modular framework which can easily incorporate an arbitrary number of rewards (free energy or non-parametric/data-driven) and secondary structure prediction algorithms. We argue that this capability has the potential to balance out the different strengths and weaknesses of state-of-the-art folding tools. We test ExpertRNA on several RNA sequence-structure data sets and compare its performance against a state-of-the-art folding algorithm. We find that ExpertRNA produces, on average, more accurate predictions than the structure prediction algorithm used, thus validating the promise of the approach.
Publisher
Cold Spring Harbor Laboratory