Affiliation:
1. College of Computer Science and Technology, Jilin University, Changchun, China
2. Key Laboratory of Symbolic Computation and Knowledge Engineering, Ministry of Education, Jilin University, Changchun, China
Abstract
Traditional machine learning methods are widely used in the field of RNA secondary structure prediction and have achieved good results. However, with the emergence of large-scale data, deep learning methods have more advantages than traditional machine learning methods. As the number of network layers increases in deep learning, there will often be problems such as increased parameters and overfitting. We used two deep learning models, GoogLeNet and TCN, to predict RNA secondary results. And from the perspective of the depth and width of the network, improvements are made based on the neural network model, which can effectively improve the computational efficiency while extracting more feature information. We process the existing real RNA data through experiments, use deep learning models to extract useful features from a large amount of RNA sequence data and structure data, and then predict the extracted features to obtain each base’s pairing probability. The characteristics of RNA secondary structure and dynamic programming methods are used to process the base prediction results, and the structure with the largest sum of the probability of each base pairing is obtained, and this structure will be used as the optimal RNA secondary structure. We, respectively, evaluated GoogLeNet and TCN models based on 5sRNA, tRNA data, and tmRNA data, and compared them with other standard prediction algorithms. The sensitivity and specificity of the GoogLeNet model on the 5sRNA and tRNA data sets are about 16% higher than the best prediction results in other algorithms. The sensitivity and specificity of the GoogLeNet model on the tmRNA dataset are about 9% higher than the best prediction results in other algorithms. As deep learning algorithms’ performance is related to the size of the data set, as the scale of RNA data continues to expand, the prediction accuracy of deep learning methods for RNA secondary structure will continue to improve.
Funder
Development and Reform Commission of Jilin Province
Subject
Health Informatics,Biomedical Engineering,Surgery,Biotechnology
Cited by
2 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献