Secondary structure prediction of long noncoding RNA: review and experimental comparison of existing approaches

Author:

Bugnon L A (1), Edera A A (1), Prochetto S (1,2), Gerard M (1), Raad J (1), Fenoy E (1), Rubiolo M (1), Chorostecki U (3), Gabaldón T (3,4,5), Ariel F (2), Di Persia L E (1), Milone D H (1), Stegmayer G (1)

Affiliation:

1. Research Institute for Signals, Systems and Computational Intelligence sinc(i) (CONICET-UNL), Ciudad Universitaria, Santa Fe, Argentina

2. IAL, CONICET, Ciudad Universitaria UNL, (3000) Santa Fe, Argentina

3. Barcelona Supercomputing Center (BSC-CNS), Institute of Research in Biomedicine (IRB), Spain

4. Catalan Institution for Research and Advanced Studies (ICREA), Barcelona, Spain

5. Centro de Investigación Biomédica En Red de Enfermedades Infecciosas (CIBERINFEC), Barcelona, Spain

Abstract

Motivation: In contrast to messenger RNAs, the function of the wide range of existing long noncoding RNAs (lncRNAs) largely depends on their structure, which determines interactions with partner molecules. Thus, determining or predicting the secondary structure of lncRNAs is critical to uncovering their function. Classical approaches for predicting RNA secondary structure have been based on dynamic programming and thermodynamic calculations. In the last 4 years, a growing number of machine learning (ML)-based models, including deep learning (DL), have achieved breakthrough performance in structure prediction of biomolecules such as proteins and have outperformed classical methods in folding short transcripts. Nevertheless, accurate structure prediction for lncRNAs remains far from solved. Notably, the many new proposals have not been systematically and experimentally evaluated.

Results: In this work, we compare the performance of the classical methods and the most recently proposed approaches for secondary structure prediction of RNA sequences using a unified and consistent experimental setup. We use the publicly available structural profiles for 3023 yeast RNA sequences, and a novel benchmark of well-characterized lncRNA structures from different species. Moreover, we propose a novel metric to assess the predictive performance of methods, based exclusively on the chemical probing data commonly used for profiling RNA structures, which avoids any potential bias introduced by computational predictions when dot-bracket references are used. Our results provide a comprehensive comparative assessment of existing methodologies, and a novel public benchmark resource to aid in the development and comparison of future approaches.

Availability: Full source code and benchmark datasets are available at https://github.com/sinc-lab/lncRNA-folding

Contact: lbugnon@sinc.unl.edu.ar

Funder

ANPCyT

Santa Fe Science, Technology and Innovation Agency

Publisher

Oxford University Press (OUP)

Subject

Molecular Biology, Information Systems

Cited by 19 articles.
