Abstract
AbstractBackgroundElucidating the Transcription Factors (TFs) that drive the gene expression changes in a given experiment is a common question asked by researchers. The existing methods rely on the predicted Transcription Factor Binding Site (TFBS) to model the changes in the motif activity. Such methods only work for TFs that have a motif and assume the TF binding profile is the same in all cell types.ResultsGiven the wealth of the ChIP-seq data available for a wide range of the TFs in various cell types, we propose that gene expression modeling can be done using ChIP-seq “signatures” directly, effectively skipping the motif finding and TFBS prediction steps. We presentxcore, an R package that allows TF activity modeling based on ChIP-seq signatures and the user's gene expression data. We also providexcoredataa companion data package that provides a collection of preprocessed ChIP-seq signatures. We demonstrate thatxcoreleads to biologically relevant predictions using transforming growth factor beta induced epithelial-mesenchymal transition time-courses, rinderpest infection time-courses, and embryonic stem cells differentiated to cardiomyocytes time-course profiled with Cap Analysis Gene Expression.Conclusionsxcoreprovides a simple analytical framework for gene expression modeling using linear models that can be easily incorporated into differential expression analysis pipelines. Taking advantage of public ChIP-seq databases,xcorecan identify meaningful molecular signatures and relevant ChIP-seq experiments.
Funder
Ministry of Education, Culture, Sport, Science and Technology of Japan for the RIKEN Center for Integrative Medical Sciences
RIKEN IMS Internship Program
European Regional Development Fund
Narodowe Centrum Nauki
Publisher
Springer Science and Business Media LLC
Subject
Applied Mathematics,Computer Science Applications,Molecular Biology,Biochemistry,Structural Biology