RiboReport - Benchmarking tools for ribosome profiling-based identification of open reading frames in bacteria

Published: 2021-06-09

Author:

Gelhausen Rick, Müller Teresa, Svensson Sarah L., Alkhnbashi Omer, Sharma Cynthia M., Eggenhofer Florian, Backofen Rolf

Abstract

Small proteins, those encoded by small open reading frames (sORFs) of 50 codons or fewer, are emerging as an important class of cellular macromolecules in all kingdoms of life. However, they are recalcitrant to detection by proteomics or in silico methods. Ribosome profiling (Ribo-seq) has revealed widespread translation of sORFs in diverse species, and this has driven the development of ORF detection tools using Ribo-seq read signals. However, only a handful of tools have been designed for bacterial data, and these have not yet been systematically compared. Here, we have performed a comprehensive benchmark of ORF prediction tools which handle bacterial Ribo-seq data. For this, we created a novel Ribo-seq dataset for E. coli, and based on this plus three publicly available datasets for different bacteria, we created a benchmark set by manual labeling of translated ORFs using their Ribo-seq expression profile. This was then used to investigate the predictive performance of four Ribo-seq-based ORF detection tools that we found to be compatible with bacterial data (Reparation_blast, DeepRibo, Ribo-TISH and SPECtre). The tool IRSOM was also included as a comparison for tools using coding potential and RNA-seq coverage only. DeepRibo and Reparation_blast robustly predicted translated ORFs, including sORFs, with no significant difference for those inside or outside of operons. However, none of the tools was able to predict a set of recently identified, novel, experimentally verified sORFs with high sensitivity. Overall, we find there is potential for improving the performance, applicability, usability, and reproducibility of prokaryotic ORF prediction tools that use Ribo-seq as input.

Key points

Created a benchmark set for Ribo-seq-based ORF prediction in bacteria

DeepRibo is the first choice for bacterial ORF prediction tasks

Tool performance is comparable between operon and single-gene regions

Identifying novel sORFs with DeepRibo is possible, with restrictions, by using the top 100 novel sORF predictions sorted by rank.

Experimental results show that considering translation initiation site data could boost the detection of novel small ORFs

Novel sORFs in E. coli were determined using a new experimental protocol that enriches for translation initiation sites. This dataset shows that a significant fraction (here 8 out of 24, i.e. 1/3) is still not detected despite sufficient Ribo-seq signal. An additional 7 could be recovered using translation initiation site protocols.

Tools should embrace the use of replicate data and improve packaging, usability and documentation.
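
The "top 100 novel sORFs sorted by rank" selection and the sensitivity figures in the key points can be illustrated with a minimal Python sketch. It is not the authors' pipeline and does not reflect the output format of DeepRibo or any other tool: the `Prediction` record, its field names, the convention that rank 1 is the most confident prediction, and the toy data are all assumptions made for illustration. Only the 50-codon threshold is taken from the abstract.

```python
from dataclasses import dataclass

MAX_SORF_CODONS = 50  # sORF threshold from the abstract: 50 codons or fewer


@dataclass(frozen=True)
class Prediction:
    """Hypothetical prediction record (not a real tool's output schema)."""
    orf_id: str      # hypothetical identifier for the predicted ORF
    codons: int      # ORF length in codons
    rank: int        # assumed convention: 1 = most confident prediction
    annotated: bool  # True if the ORF is already in the reference annotation


def top_novel_sorfs(predictions, n=100):
    """Return the n best-ranked predictions that are small and unannotated."""
    novel = [
        p for p in predictions
        if not p.annotated and p.codons <= MAX_SORF_CODONS
    ]
    return sorted(novel, key=lambda p: p.rank)[:n]


def sensitivity(predicted_ids, verified_ids):
    """Fraction of verified ORFs that also appear among the predictions."""
    verified = set(verified_ids)
    if not verified:
        raise ValueError("verified set is empty")
    return len(verified & set(predicted_ids)) / len(verified)


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    preds = [
        Prediction("orfA", 30, 1, False),
        Prediction("orfB", 400, 2, True),   # annotated and too long
        Prediction("orfC", 45, 3, False),
        Prediction("orfD", 51, 4, False),   # one codon above the threshold
        Prediction("orfE", 12, 5, True),    # small but already annotated
    ]
    picked = top_novel_sorfs(preds, n=2)
    print([p.orf_id for p in picked])
    print(sensitivity([p.orf_id for p in picked], ["orfA", "orfX"]))
```

With a verified set of 24 sORFs, the same `sensitivity` function would return 16/24 if 8 were missed, matching the "8 out of 24" figure above.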

Publisher

Cold Spring Harbor Laboratory
