Abstract
Transcription of many bacterial genes is regulated by alternative RNA polymerase sigma factors such as sigma 54 (σ54). A single essential σ factor promotes transcription of thousands of genes, while many alternative σ factors promote transcription of specialized sets of genes required for coping with stress or for development. Bacterial genomes encode two families of sigma factors, sigma 70 (σ70) and sigma 54 (σ54). σ54 uses a more complex mechanism that involves specialized enhancer-binding proteins and DNA melting, and it is well known for its role in regulating nitrogen metabolism in proteobacteria. Identifying these regulatory elements is the main step toward understanding metabolic networks. In this study, we propose a supervised pattern recognition model based on a neural network to identify Transcription Factor Binding Sites (TFBSs) for σ54. This approach detects σ54 TFBSs with sensitivity higher than 98% on recently published data. False positives are reduced by the addition of an artificial neural network (ANN) and feature extraction, which increases the specificity of the program. We also present a free, fast and user-friendly tool for σ54 recognition, S54Finder, and a database of σ54-related genes that is available for consultation. S54Finder can analyze inputs ranging from short DNA sequences to complete genomes and is available online. The software was used to determine σ54 TFBSs across the complete bacterial genomes database from NCBI, and the results are available for comparison. S54Finder identifies σ54-regulated genes for a large set of genomes, enabling evolutionary and conservation studies of this regulatory system across organisms.
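For orientation, σ54-dependent promoters are commonly characterized by conserved GG and GC dinucleotides near positions -24 and -12, often summarized as the consensus TGGCACG-N4-TTGCW. The sketch below is not the S54Finder method, which the abstract describes as a neural network with feature extraction. It is only a minimal exact-consensus scan on both strands, with hypothetical function names, illustrating the kind of candidate sites that a trained classifier would then have to score and filter.

```python
import re

# Assumption: the widely cited sigma-54 consensus TGGCACG-N4-TTGCW (W = A or T).
# A lookahead is used so that overlapping candidate sites are all reported.
CONSENSUS = re.compile(r"(?=(TGGCACG[ACGT]{4}TTGC[AT]))")
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def scan_sigma54(seq):
    """Return sorted (start, strand, site) tuples for exact consensus matches.

    `start` is the 0-based leftmost coordinate of the site on the forward
    strand, regardless of which strand the match was found on.
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for m in CONSENSUS.finditer(s):
            site = m.group(1)
            start = m.start(1)
            if strand == "-":
                # Map the reverse-strand offset back to forward coordinates.
                start = len(seq) - start - len(site)
            hits.append((start, strand, site))
    return sorted(hits)
```

An exact-match scan like this misses real sites that deviate from the consensus and reports chance matches, which is the trade-off between sensitivity and specificity that motivates a learned model.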
Publisher
Cold Spring Harbor Laboratory