Abstract
Background
Computational biologists investigate gene expression time-series data using estimation, clustering, alignment, and enrichment methods to make biological sense of the data and provide compelling visualization. While microarray and RNA-seq data are abundant, interpreting the data while capturing the dynamism of a time-course experiment remains a difficult challenge. Advancements in RNA-seq technologies have allowed us to collect extensive profiles of diverse developmental processes, but they also require additional methods for analysis and data integration to capture the increased dynamism. Interpreting systems biology data requires an approach that captures the dynamism and direction of change in a time-course experiment in a holistic manner and simultaneously identifies which biological pathways are significantly altered. In addition, there is a need for a method to evaluate the viability of model organisms across different treatments and conditions. By comparing the effects of a specific treatment (e.g., a drug) on the target pathway between multiple species, and by determining which pathways respond similarly to biological cues between organisms, we can determine the best animal model for that treatment for future studies.
Methods
Here, we present Dynamic Impact Approach with Normalization (DIA-norm), a dynamic pathway analysis tool for the analysis of time-course data without unsupervised dimensionality reduction. We analyzed five datasets of mesenchymal stem cells retrieved from the Gene Expression Omnibus data repository (3 human, 1 mouse cell line, 1 pig), which were differentiated in vitro towards adipogenesis. In the first step, DIA-norm calculated an impact score and a flux score for each biological term using p-value and fold change. In the second step, these scores were normalized and interpolated using a cubic spline. Cross-correlation was then performed between all the datasets, with r ≥ 0.6 as the benchmark for high correlation, since r = 0.7 is the limit of experimental reproducibility.
Results
DIA-norm predicted that the pig was a better model for humans than the mouse for the study of adipogenesis. When each was compared with human, the pig model had a higher number of correlating pathways (64.5 vs. 30.5) and a higher average correlation (r = 0.51 vs. r = 0.46) than the mouse model. While not a definitive conclusion, the results are in accordance with prior phylogenetic and disease studies in which pigs are a good model for studying humans, specifically regarding obesity. In addition, DIA-norm identified a larger number of biologically important pathways (approximately twice as many) than a comparable enrichment analysis tool, DAVID. DIA-norm also identified some possible pathways of interest for adipogenesis, namely nitrogen metabolism (r = 0.86), for which there is little to no existing literature.
Conclusion
DIA-norm captured more than 80% of biologically important pathways and achieved high pathway correlation between species for the vast majority of important adipogenesis pathways. DIA-norm can be used both for time-series pathway analysis and for the determination of a model organism. Our findings indicate that DIA-norm can be used to study the effect of any treatment, including drugs, on specific pathways between multiple species to determine the best animal model for that treatment for future studies. In the selected transcriptomic studies, DIA-norm identified a higher number of total and biologically relevant pathways than enrichment-based tools, which demonstrates its reliability in providing biological insights. A final advantage of DIA-norm is its easily interpretable graphical output, which aids in visualizing dynamic changes in expression.
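The second step described in Methods (normalization, cubic spline interpolation, and cross-species correlation against the r ≥ 0.6 benchmark) can be illustrated with a minimal sketch. This is not the authors' implementation. It starts from per-pathway scores that are assumed to be already computed, because the abstract does not give the impact and flux formulas. Min-max normalization, a natural boundary condition for the spline, a shared time axis for both datasets, and Pearson correlation at zero lag are all assumptions made for illustration, and every function name is hypothetical.

```python
from bisect import bisect_right
from math import sqrt


def min_max_normalize(values):
    """Rescale scores to [0, 1] (assumed normalization, not from the paper)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def natural_cubic_spline(xs, ys):
    """Return a function evaluating the natural cubic spline through (xs, ys).

    xs must be strictly increasing and contain at least two points.
    """
    n = len(xs) - 1
    h = [xs[i + 1] - xs[i] for i in range(n)]
    m = [0.0] * (n + 1)  # second derivatives; zero at both ends (natural)
    if n >= 2:
        # Tridiagonal system for the interior knots, solved by elimination.
        diag = [2.0 * (h[i - 1] + h[i]) for i in range(1, n)]
        rhs = [6.0 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])
               for i in range(1, n)]
        for k in range(1, n - 1):
            w = h[k] / diag[k - 1]
            diag[k] -= w * h[k]
            rhs[k] -= w * rhs[k - 1]
        inner = [0.0] * (n - 1)
        inner[-1] = rhs[-1] / diag[-1]
        for k in range(n - 3, -1, -1):
            inner[k] = (rhs[k] - h[k + 1] * inner[k + 1]) / diag[k]
        m[1:n] = inner

    def evaluate(t):
        # Locate the interval containing t, clamped to the valid range.
        i = min(max(bisect_right(xs, t) - 1, 0), n - 1)
        a, b = xs[i + 1] - t, t - xs[i]
        return ((m[i] * a ** 3 + m[i + 1] * b ** 3) / (6.0 * h[i])
                + (ys[i] / h[i] - m[i] * h[i] / 6.0) * a
                + (ys[i + 1] / h[i] - m[i + 1] * h[i] / 6.0) * b)

    return evaluate


def pearson_r(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def pathway_correlation(times_a, scores_a, times_b, scores_b, n_points=50):
    """Correlate one pathway's trajectory between two datasets.

    Both trajectories are normalized, splined, and resampled on a common
    grid over the overlapping time range before computing zero-lag r.
    """
    spline_a = natural_cubic_spline(times_a, min_max_normalize(scores_a))
    spline_b = natural_cubic_spline(times_b, min_max_normalize(scores_b))
    start = max(times_a[0], times_b[0])
    end = min(times_a[-1], times_b[-1])
    grid = [start + (end - start) * k / (n_points - 1) for k in range(n_points)]
    return pearson_r([spline_a(t) for t in grid], [spline_b(t) for t in grid])


# Invented example values (days and scores), not data from the study.
human_days, human_scores = [0, 2, 4, 8, 14], [1.0, 2.0, 3.0, 5.0, 8.0]
pig_days, pig_scores = [0, 3, 7, 14], [0.0, 6.0, 14.0, 28.0]
r = pathway_correlation(human_days, human_scores, pig_days, pig_scores)
highly_correlated = r >= 0.6  # benchmark quoted in the abstract
```

Resampling both splines on one grid is what allows datasets with different sampling days to be compared point by point. Pearson's r is unchanged by linear rescaling, so the normalization step here mainly matters for plotting trajectories on a shared axis.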
Publisher
Cold Spring Harbor Laboratory
Cited by
1 article.