Abstract
AbstractBackgroundTranscription factor (TF) regulates the transcription of DNA to messenger RNA by binding to upstream sequence motifs. Identifying the locations of known motifs in whole genomes is computationally intensive.Methodology/Principal FindingsThis study presents a computational tool, named “Grit”, for screening TF-binding sites (TFBS) by coordinating transcription factors to their promoter sequences in orthologous genes. This tool employs a newly developed mixed Student’s t-test statistical method that detects high-scoring conserved and non-conserved binding sites among species. The program performs sequence scanning at a rate of 3.2 Mb/s on a quad-core Amazon server and has been benchmarked by the well-established ChIP-Seq datasets, putting Grit amongst the top-ranked TFBS predictors. It marginally outperforms the well-known transcription factor motif scanning tools, Pscan (4.8%) and FIMO (17.8%), in analyzing well-documented ChIP-Atlas human genome Chip-Seq datasets.SignificanceGrit is a good alternative to current available motif scanning tools and is publicly available at http://www.thua45.cn/grit under an academic free license.Author SummaryLocating transcription factor-binding (TF-binding) site in the genome and identification their function is fundamental in understanding various biological processes. Improve the performance of the prediction tools is important because accurate TF-binding site prediction can save cost and time for wet-lab experiments. Also, genome wide TF-binding site prediction can provide new insights for transcriptome regulation in system biology perspective. This study developed a new TF-binding site prediction tool based on mixed Student’s t-test statistical method. The tool is amongst the top-ranked TF-binding site predictors, as such, it can help the researchers in TF-binding site identification and transcriptional regulation mechanism interpretation of genes.
Publisher
Cold Spring Harbor Laboratory