Abstract
Emerging evidence places small proteins (≤ 50 amino acids) more centrally in physiological processes. Yet the identification of functional small proteins and the systematic genome annotation of their cognate small open reading frames (smORFs) remain challenging both experimentally and computationally. Ribosome profiling, or Ribo-Seq (deep sequencing of ribosome-protected fragments), enables the detection of actively translated open reading frames (ORFs) and the empirical annotation of coding sequences (CDSs) using the in-register translation pattern that is characteristic of genuinely translating ribosomes. Multiple ORF identifiers that use 3-nt periodicity in Ribo-Seq data sets have been successful in eukaryotic smORF annotation. Yet they have difficulty evaluating prokaryotic genomes because of their unique architecture (e.g. polycistronic messages, overlapping ORFs, leaderless translation, non-canonical initiation). Here, we present a new algorithm, smORFer, which detects smORFs in prokaryotic organisms with high accuracy. The unique feature of smORFer is its integrated approach: it considers structural features of the genetic sequence along with in-register translation and uses a Fourier transform to convert these parameters into a measurable score to faithfully select smORFs. The algorithm is executed in a modular way and, depending on the data available for a particular organism, allows different modules to be used for the smORF search.
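The idea of turning 3-nt periodicity into a score with a Fourier transform can be illustrated with a minimal sketch. This is not smORFer's actual implementation: the function names `dft_power` and `periodicity_score`, the per-nucleotide count input, and the choice of normalising by total non-DC power are all assumptions made for illustration.

```python
import cmath
import math


def dft_power(signal, k):
    """Power of the k-th discrete Fourier component of `signal`."""
    n = len(signal)
    component = sum(
        value * cmath.exp(-2j * math.pi * k * i / n)
        for i, value in enumerate(signal)
    )
    return abs(component) ** 2


def periodicity_score(counts):
    """Hypothetical score: share of non-DC spectral power at the 3-nt period.

    `counts` is an assumed input format: one read count per nucleotide of a
    candidate ORF, whose length must be a multiple of 3.
    """
    n = len(counts)
    if n < 6 or n % 3 != 0:
        raise ValueError("need at least 6 positions and a length divisible by 3")

    # Remove the mean so that overall coverage depth does not enter the score.
    mean = sum(counts) / n
    centered = [c - mean for c in counts]

    # Positive half of the spectrum, excluding the zero-frequency term.
    powers = {k: dft_power(centered, k) for k in range(1, n // 2 + 1)}
    total = sum(powers.values())
    if total < 1e-12:
        return 0.0  # flat profile: no periodic signal at all

    # A period of 3 nt corresponds to frequency index n / 3.
    return powers[n // 3] / total
```

Under these assumptions, a profile with all reads in one frame, such as `[10, 0, 0] * 10`, scores close to 1, while a flat profile scores 0. Any threshold applied to such a score would have to be calibrated against annotated ORFs.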
Publisher
Cold Spring Harbor Laboratory
Cited by
2 articles.