Abstract
Background
RNA-Seq is a powerful tool that has been widely used in a variety of studies. Unmapped RNA-Seq reads, however, are usually considered useless and are discarded or ignored.
Results
We develop a strategy for recovering full-length sequences from unmapped reads by combining the design of specific reverse transcription primers with high-throughput sequencing. In this study, we salvage 36 unmapped reads from standard RNA-Seq data and randomly select one 149 bp read as a model. Specific reverse transcription primers are designed to amplify both of its ends, followed by next-generation sequencing. We then design a statistical model based on a power-law distribution to estimate the completeness and significance of the assembled sequence, and we validate the result by Sanger sequencing. The full-length sequence is 1556 bp and contains insertion mutations in a microsatellite structure.
Conclusion
We believe this method is a useful strategy for extracting sequence information from unmapped RNA-Seq data. It also offers an alternative way to obtain the full-length sequence of an unknown cDNA.
Funder
National Natural Science Foundation of China
The Agricultural Science and Technology Innovation Program
The Elite Young Scientists Program of Chinese Academy of Agricultural Sciences
Natural Science Foundation of Guangdong Province
National Key Research and Development Program of China
Science and Technology Planning Project of Shenzhen Municipality
Publisher
Springer Science and Business Media LLC
Cited by
3 articles.