TF-Cluster: A pipeline for identifying functionally coordinated transcription factors via network decomposition of the shared coexpression connectivity matrix (SCCM)

Published:2011-04-15 Issue:1 Volume:5 Page:
ISSN:1752-0509
Container-title:BMC Systems Biology
language:en
Short-container-title:BMC Syst Biol

Author:

Nie Jeff, Stewart Ron, Zhang Hang, Thomson James A, Ruan Fang, Cui Xiaoqi, Wei Hairong

Abstract

Background: Identifying the key transcription factors (TFs) controlling a biological process is the first step toward a better understanding of the underlying regulatory mechanisms. However, because gene regulatory networks involve a large number of genes and complex interactions, identifying the TFs involved in a biological process remains particularly difficult. The challenges include: (1) Most eukaryotic genomes encode thousands of TFs, which are organized in gene families of various sizes and in many cases have poor sequence conservation, making it difficult to recognize the TFs for a biological process; (2) Transcription usually involves several hundred genes that generate a combination of intrinsic noise from upstream signaling networks and lead to fluctuations in transcription; (3) A TF can function in different cell types or developmental stages. Methods for identifying TFs involved in biological processes are still very scarce, and novel, more powerful methods are needed.

Results: We developed a computational pipeline called TF-Cluster for identifying functionally coordinated TFs in two steps: (1) Construction of a shared coexpression connectivity matrix (SCCM), in which each entry represents the number of shared coexpressed genes between two TFs. This sparse and symmetric matrix embodies a new concept of coexpression networks in which genes are associated in the context of other shared coexpressed genes; (2) Decomposition of the SCCM using a novel heuristic algorithm termed "Triple-Link", which searches for the highest connectivity in the SCCM and then uses the two connected TFs as a primer for growing a TF cluster with a number of linking criteria. We applied TF-Cluster to microarray data from human stem cells and Arabidopsis roots, and demonstrated that many of the resulting TF clusters contain functionally coordinated TFs that, based on existing literature, accurately represent a biological process of interest.

Conclusions: TF-Cluster can be used to identify a set of TFs controlling a biological process of interest from gene expression data. Its high accuracy in recognizing true positive TFs involved in a biological process makes it extremely valuable in building core GRNs controlling a biological process. The pipeline is implemented in Perl and can be installed on various platforms.
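The SCCM described in step (1) can be illustrated with a minimal sketch. It is not the authors' Perl implementation, and several choices are assumptions made only for illustration: Pearson correlation as the coexpression measure, a fixed top-`k` cut-off to define each TF's coexpressed genes, and the hypothetical function names `pearson`, `top_coexpressed`, `build_sccm` and `seed_pair`. The abstract does not state which measure or thresholds TF-Cluster uses, and the Triple-Link linking criteria are not reproduced here; the sketch stops at locating the highest-connectivity TF pair that would serve as the primer.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def top_coexpressed(tf, genes, expr, k):
    """Set of the k candidate genes most correlated with `tf` (assumed cut-off)."""
    scores = [(pearson(expr[tf], expr[g]), g) for g in genes if g != tf]
    scores.sort(reverse=True)
    return {g for _, g in scores[:k]}


def build_sccm(tfs, genes, expr, k):
    """Sparse symmetric matrix: sccm[a][b] = number of coexpressed genes shared by a and b."""
    neighbours = {tf: top_coexpressed(tf, genes, expr, k) for tf in tfs}
    sccm = {tf: {} for tf in tfs}
    for a, b in combinations(tfs, 2):
        shared = len(neighbours[a] & neighbours[b])
        if shared:  # store only non-zero entries to keep the matrix sparse
            sccm[a][b] = sccm[b][a] = shared
    return sccm


def seed_pair(sccm):
    """TF pair with the highest shared count, returned as (tf_a, tf_b, count)."""
    best = max(
        ((n, a, b) for a, row in sccm.items() for b, n in row.items() if a < b),
        default=None,
    )
    return None if best is None else (best[1], best[2], best[0])


# Toy expression profiles (made-up numbers, five samples per gene).
expr = {
    "TF1": [1, 2, 3, 4, 5],
    "TF2": [1, 2, 3, 4, 6],
    "TF3": [5, 4, 3, 2, 1],
    "gA": [2, 4, 6, 8, 10],
    "gB": [1, 2, 3, 5, 6],
    "gC": [10, 8, 6, 4, 2],
    "gD": [6, 5, 3, 2, 1],
    "gE": [3, 1, 4, 1, 5],
}
tfs = ["TF1", "TF2", "TF3"]
genes = ["gA", "gB", "gC", "gD", "gE"]

sccm = build_sccm(tfs, genes, expr, k=2)
print(sccm)
print(seed_pair(sccm))
```

In this toy data, TF1 and TF2 share both of their top-ranked genes while TF3 shares none with either, so only the TF1/TF2 entry is stored. Storing only non-zero entries mirrors the sparsity the abstract attributes to the SCCM.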

Publisher

Springer Science and Business Media LLC

Subject

Applied Mathematics, Computer Science Applications, Molecular Biology, Modeling and Simulation, Structural Biology

Link

https://link.springer.com/content/pdf/10.1186/1752-0509-5-53.pdf

Reference 118 articles.

1. Takahashi K, Tanabe K, Ohnuki M, Narita M, Ichisaka T, Tomoda K, Yamanaka S: Induction of pluripotent stem cells from adult human fibroblasts by defined factors. Cell. 2007, 131 (5): 861-872. 10.1016/j.cell.2007.11.019

2. Yu J, Vodyanik MA, Smuga-Otto K, Antosiewicz-Bourget J, Frane JL, Tian S, Nie J, Jonsdottir GA, Ruotti V, Stewart R, et al.: Induced pluripotent stem cell lines derived from human somatic cells. Science. 2007, 318 (5858): 1917-1920. 10.1126/science.1151526

3. Zhou Q, Brown J, Kanarek A, Rajagopal J, Melton DA: In vivo reprogramming of adult pancreatic exocrine cells to beta-cells. Nature. 2008, 455 (7213): 627-632. 10.1038/nature07314

4. Mittler R, Blumwald E: Genetic engineering for modern agriculture: challenges and perspectives. Annu Rev Plant Biol. 2010, 61: 443-462. 10.1146/annurev-arplant-042809-112116

5. Cui X, Wang T, Chen HS, Busov V, Wei H: TF-finder: a software package for identifying transcription factors involved in biological processes using microarray data and existing knowledge base. BMC Bioinformatics. 2010, 11: 425. 10.1186/1471-2105-11-425

Cited by 27 articles.

1. Uncovering co-regulatory modules and gene regulatory networks in the heart through machine learning-based analysis of large-scale epigenomic data;Computers in Biology and Medicine;2024-03

2. Regulation of regeneration in Arabidopsis thaliana;aBIOTECH;2023-11-22

3. Comparative transcriptomic screen identifies expression of key genes involved in pattern-triggered immunity induced by salicylic acid in strawberry;Horticulture, Environment, and Biotechnology;2023-07-21

4. Systems-level transcriptional regulation of Caenorhabditis elegans metabolism;2022-11-09

5. Network and epigenetic characterization of subsets of genes specifically expressed in maize bundle sheath cells;Computational and Structural Biotechnology Journal;2022