Abstract
Summary
Upstream open reading frames (uORFs, encoding so-called leader peptides) can regulate translation and transcription of downstream main ORFs (mORFs) in prokaryotes and eukaryotes. However, annotation of novel functional uORFs is challenging due to their short size, usually less than 100 codons. While transcription- and translation-level next-generation sequencing (NGS) methods can be used for genome-wide uORF identification, these data are not available for the vast majority of species with sequenced genomes. At the same time, the exponentially increasing number of genome assemblies gives us the opportunity to take advantage of evolutionary conservation in our predictions of ORFs.
Here we present uORF4u, a tool for conserved uORF annotation in 5′ upstream sequences of a user-defined protein of interest or a set of protein homologues. It can also be used to find small ORFs within a set of nucleotide sequences. The output includes publication-quality figures with multiple sequence alignments, sequence logos and locus annotation of the predicted uORFs in graphical vector format.
Availability and Implementation
uORF4u is written in Python3 and runs on Linux and MacOS. The command-line interface covers most practical use cases, while the provided Python API allows usage within a Python program and additional customisation. Source code is available from the GitHub page: https://github.com/art-egorov/uorf4u. Detailed documentation that includes an example-driven guide is available at the software home page: https://art-egorov.github.io/uorf4u.
Publisher
Cold Spring Harbor Laboratory
Cited by
2 articles.