Affiliation:
1. Department of Biochemistry and Molecular Biology, The University of Texas Medical Branch, Galveston, TX 77555, USA
Abstract
Motivation
Inferring the direct relationships between biomolecules from omics datasets is essential for understanding biological and disease mechanisms. The Gaussian graphical model (GGM) provides a fairly simple and accurate representation of these interactions. However, estimating the associated interaction matrix from data is challenging because the number of measured molecules is high and the number of samples is low.
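To make the estimation difficulty concrete, the following is a minimal sketch, not the method of this article. It assumes that the interaction matrix of a GGM is the precision matrix (the inverse covariance), and it swaps in a generic ridge-regularized inverse as a stand-in estimator. The function names `ridge_precision` and `partial_correlations`, the penalty `lam`, and the simulated data sizes are all hypothetical choices made for illustration.

```python
import numpy as np

def ridge_precision(X, lam=0.1):
    """Generic ridge-regularized precision estimate: inverse of (S + lam * I).

    This is a stand-in estimator chosen for illustration, not the
    analytic formula derived in the article.
    """
    S = np.cov(X, rowvar=False)          # sample covariance, shape (p, p)
    p = S.shape[0]
    return np.linalg.inv(S + lam * np.eye(p))

def partial_correlations(theta):
    """Convert a precision matrix into partial correlations (GGM edge weights)."""
    d = np.sqrt(np.diag(theta))
    rho = -theta / np.outer(d, d)
    np.fill_diagonal(rho, 1.0)
    return rho

# Simulated data (hypothetical sizes): fewer samples than measured molecules.
rng = np.random.default_rng(0)
n_samples, n_molecules = 20, 50
X = rng.standard_normal((n_samples, n_molecules))

# The sample covariance has rank at most n_samples - 1, so it is singular
# and cannot be inverted directly when n_molecules > n_samples.
S = np.cov(X, rowvar=False)
rank_S = np.linalg.matrix_rank(S)

# Regularization makes the matrix invertible, at the cost of some bias.
theta = ridge_precision(X, lam=0.1)
rho = partial_correlations(theta)
```

In this sketch the off-diagonal entries of `rho` play the role of edge weights in the graph. The choice of `lam` is arbitrary here; in practice it would need to be tuned, for example by cross-validation.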
Results
In this article, we use the thermodynamic entropy of the non-equilibrium system of molecules and the data-driven constraints among their expressions to derive an analytic formula for the interaction matrix of Gaussian models. Through a data simulation, we show that our method returns an improved estimate of the interaction matrix. Using the developed method, we also estimate the interaction matrix associated with the plasma proteome and construct the corresponding GGM. We show that known NAFLD-related proteins such as ADIPOQ, APOC, APOE, DPP4, CAT, GC, HP, CETP, SERPINA1, COLA1, PIGR, IGHD, SAA1 and FCGBP are among the top 15% most interacting proteins of the dataset.
Availability and implementation
The supplementary materials can be found at the following URL: http://dynamic-proteome.utmb.edu/PrecisionMatrixEstimater/PrecisionMatrixEstimater.aspx.
Supplementary information
Supplementary data are available at Bioinformatics online.
Funder
National Institute of General Medical Sciences
National Institutes of Health
Gulf Coast Consortia
NLM Training Program in Biomedical Informatics & Data Science
Publisher
Oxford University Press (OUP)
Subject
Computational Mathematics,Computational Theory and Mathematics,Computer Science Applications,Molecular Biology,Biochemistry,Statistics and Probability