OperonSEQer: A set of machine-learning algorithms with threshold voting for detection of operon pairs using short-read RNA-sequencing data

Published:2022-01-05 Issue:1 Volume:18 Page:e1009731
ISSN:1553-7358
Container-title:PLOS Computational Biology
language:en
Short-container-title:PLoS Comput Biol

Author:

Krishnakumar Raga, Ruffing Anne M.

Abstract

Operon prediction in prokaryotes is critical not only for understanding the regulation of endogenous gene expression, but also for exogenous targeting of genes using newly developed tools such as CRISPR-based gene modulation. A number of methods have used transcriptomics data to predict operons, based on the premise that contiguous genes in an operon will be expressed at similar levels. While promising results have been observed using these methods, most of them do not address uncertainty caused by technical variability between experiments, which is especially relevant when the amount of data available is small. In addition, many existing methods do not provide the flexibility to determine the stringency with which genes should be evaluated for being in an operon pair. We present OperonSEQer, a set of machine learning algorithms that uses the statistic and p-value from a non-parametric analysis of variance test (Kruskal-Wallis) to determine the likelihood that two adjacent genes are expressed from the same RNA molecule. We implement a voting system that allows users to choose the stringency of operon calls depending on whether their priority is high recall or high specificity. In addition, we provide the code so that users can retrain the algorithm and re-establish hyperparameters based on any data they choose, allowing this method to be expanded as additional data are generated. We show that our approach detects operon pairs that are missed by current methods by comparing our predictions to publicly available long-read sequencing data. OperonSEQer therefore improves on existing methods in terms of accuracy, flexibility, and adaptability.
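The threshold-voting idea described in the abstract can be illustrated with a minimal sketch. This is not the OperonSEQer code: the function name `call_operon_pair`, the model names, and the example votes are all hypothetical, and the sketch assumes only that each trained classifier emits a yes/no vote for an adjacent gene pair. A low threshold accepts a pair when few classifiers agree (favoring recall), while a high threshold demands broad agreement (favoring specificity).

```python
from typing import Dict


def call_operon_pair(votes: Dict[str, bool], threshold: int) -> bool:
    """Return True when at least `threshold` classifiers vote 'same operon'.

    `votes` maps a (hypothetical) classifier name to its binary prediction
    for one pair of adjacent genes.
    """
    if not 1 <= threshold <= len(votes):
        raise ValueError("threshold must be between 1 and the number of classifiers")
    # Booleans sum as 0/1, so this counts the positive votes.
    return sum(votes.values()) >= threshold


# Hypothetical votes for a single adjacent gene pair (illustrative only).
example_votes = {
    "model_a": True,
    "model_b": True,
    "model_c": False,
    "model_d": True,
    "model_e": False,
    "model_f": False,
}

lenient = call_operon_pair(example_votes, threshold=2)  # favors recall
strict = call_operon_pair(example_votes, threshold=5)   # favors specificity
```

With three positive votes out of six, the pair is called an operon pair at the lenient threshold but not at the strict one, which is the trade-off the abstract leaves to the user.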

Funder

Sandia National Laboratories

Publisher

Public Library of Science (PLoS)

Subject

Computational Theory and Mathematics; Cellular and Molecular Neuroscience; Genetics; Molecular Biology; Ecology; Modeling and Simulation; Ecology, Evolution, Behavior and Systematics

Cited by 4 articles.

1. Noncontiguous operon atlas for the Staphylococcus aureus genome;microLife;2024

2. Influence of genomic variations on glanders serodiagnostic antigens using integrative genomic and transcriptomic approaches;Frontiers in Veterinary Science;2023-12-06

3. Methodologies for bacterial ribonuclease characterization using RNA-seq;FEMS Microbiology Reviews;2023-09-01

4. Characterization of radiation-resistance mechanism in Spirosoma montaniterrae DY10T in terms of transcriptional regulatory system;Scientific Reports;2023-03-23