Abstract
Long non-coding RNAs (lncRNAs) have been widely studied for their biological significance. In general, they need to be distinguished from protein-coding RNAs (pcRNAs) with similar functions. Algorithms and tools based on various strategies have been designed and developed to train and validate such classifiers. However, many of them lack scalability and versatility, and rely heavily on genome annotation. In this paper, we design a convenient and biologically meaningful classification tool, "PreLnc2", which uses multi-scale position and frequency information from the wavelet transform spectrum and generalizes the frequency statistics method. We then used the extracted features together with auxiliary features to train the model and verified it on test data. PreLnc2 achieved 93.2% accuracy for animal and plant transcripts, outperforming PreLnc by 2.1%. Our method provides an effective alternative for the prediction of lncRNAs.
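To illustrate the kind of feature extraction the abstract describes, the following is a minimal sketch and not the PreLnc2 implementation. It combines k-mer frequency statistics with per-scale energies of a wavelet decomposition. The Haar wavelet, the GC-indicator encoding of the sequence, and the function names `kmer_frequencies`, `haar_energies`, and `extract_features` are all assumptions made for illustration; the abstract does not specify the wavelet, the encoding, or the auxiliary features.

```python
from itertools import product
from math import sqrt


def kmer_frequencies(seq, k=3):
    """Relative frequency of each k-mer over ACGT (U is treated as T)."""
    seq = seq.upper().replace("U", "T")
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip windows containing ambiguous bases
            counts[word] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


def haar_energies(signal, levels=3):
    """Mean energy of the Haar detail coefficients at each scale."""
    approx = list(signal)
    energies = []
    for _ in range(levels):
        if len(approx) < 2:
            energies.append(0.0)  # signal too short for this scale
            continue
        if len(approx) % 2:
            approx.append(approx[-1])  # pad odd lengths with the last sample
        pairs = list(zip(approx[0::2], approx[1::2]))
        detail = [(a - b) / sqrt(2) for a, b in pairs]
        approx = [(a + b) / sqrt(2) for a, b in pairs]
        energies.append(sum(d * d for d in detail) / len(detail))
    return energies


def extract_features(seq, k=3, levels=3):
    """Concatenate k-mer frequencies with multi-scale wavelet energies."""
    seq = seq.upper().replace("U", "T")
    # Assumed encoding: 1.0 for G or C, 0.0 for any other base.
    gc_signal = [1.0 if base in "GC" else 0.0 for base in seq]
    return kmer_frequencies(seq, k) + haar_energies(gc_signal, levels)
```

A feature vector produced this way could be passed to any standard classifier. The k-mer part captures composition, while the wavelet energies summarize how the signal varies at successively coarser scales along the transcript.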
Funder
Shaanxi Provincial Science and Technology Department
Publisher
Public Library of Science (PLoS)