UFold: fast and accurate RNA secondary structure prediction with deep learning

Author:

Fu Laiyi (1, 2), Cao Yingxin (2, 3, 4), Wu Jie (5), Peng Qinke (1), Nie Qing (6, 3, 4), Xie Xiaohui (2)

Affiliation:

1. Systems Engineering Institute, School of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an, Shaanxi 710049, China

2. Department of Computer Science, University of California, Irvine, CA 92697, USA

3. Center for Complex Biological Systems, University of California, Irvine, CA 92697, USA

4. NSF-Simons Center for Multiscale Cell Fate Research, University of California, Irvine, CA 92697, USA

5. Department of Biological Chemistry, University of California, Irvine, CA 92697, USA

6. Department of Mathematics, University of California, Irvine, CA 92697, USA

Abstract

For many RNA molecules, the secondary structure is essential for correct function. Predicting RNA secondary structure from nucleotide sequences is a long-standing problem in genomics, but prediction performance has reached a plateau over time. Traditional RNA secondary structure prediction algorithms are primarily based on thermodynamic models and free energy minimization, an approach that imposes strong prior assumptions and is slow to run. Here, we propose a deep learning-based method, called UFold, for RNA secondary structure prediction, trained directly on annotated data and base-pairing rules. UFold introduces a novel image-like representation of RNA sequences, which can be efficiently processed by Fully Convolutional Networks (FCNs). We benchmark the performance of UFold on both within- and cross-family RNA datasets. It significantly outperforms previous methods on within-family datasets, while achieving performance similar to that of traditional methods when trained and tested on distinct RNA families. UFold is also able to predict pseudoknots accurately. Its prediction is fast, with an inference time of about 160 ms per sequence for sequences up to 1500 bp in length. An online web server running UFold is available at https://ufold.ics.uci.edu. Code is available at https://github.com/uci-cbcl/UFold.
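
The abstract does not spell out how the image-like representation is built. As a rough illustration only, the sketch below assumes one plausible construction: each nucleotide is one-hot encoded over A, C, G, U, and every pair of positions (i, j) is described by the product of the two encodings, giving 16 channels of size L x L that a convolutional network could consume like an image. The names `one_hot` and `image_like_encoding` are hypothetical and are not taken from the UFold code base.

```python
from itertools import product

BASES = "ACGU"  # assumed channel order; purely illustrative


def one_hot(seq):
    """Return an L x 4 one-hot encoding; unknown symbols map to all zeros."""
    return [[1 if base == b else 0 for b in BASES] for base in seq.upper()]


def image_like_encoding(seq):
    """Build a 16 x L x L tensor (nested lists) from pairwise one-hot products.

    Channel (a, b) holds a 1 at position (i, j) exactly when seq[i] is base a
    and seq[j] is base b. This construction is an assumption made for
    illustration, not something stated in the abstract.
    """
    x = one_hot(seq)
    n = len(seq)
    channels = []
    for a, b in product(range(4), repeat=2):
        channels.append(
            [[x[i][a] * x[j][b] for j in range(n)] for i in range(n)]
        )
    return channels


image = image_like_encoding("GCAU")
# Shape of the resulting tensor: channels, rows, columns
print(len(image), len(image[0]), len(image[0][0]))
```

For a sequence of length L this yields a fixed number of channels regardless of L, which is the property that would let a fully convolutional network handle variable-length inputs without resizing.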

Funder

NSF

NIH

Simons Foundation

Publisher

Oxford University Press (OUP)

Subject

Genetics

References: 63 articles.
