Abstract
Biological networks, which summarize the regulatory interactions and other relationships between molecules, are important for the analysis of human diseases. Understanding and constructing networks of molecules such as DNA, RNA, and proteins can help elucidate the mechanisms of complex biological systems. Gaussian graphical models (GGMs) are popular tools for estimating biological networks. Nonetheless, reconstructing GGMs from high-dimensional datasets remains challenging: current methods do not handle the sparsity and high dimensionality of such datasets well. Here we develop a new GGM, the GR2D2 (Graphical R2-induced Dirichlet Decomposition) model, based on the R2D2 priors for linear models. We also provide a data-augmented block Gibbs sampler algorithm. The R code is available at https://github.com/RavenGan/GR2D2. In various simulation settings, the GR2D2 estimator shows superior performance in estimating precision matrices compared with existing techniques. When the true precision matrix is sparse and high-dimensional, GR2D2 provides the estimates with the smallest information divergence from the underlying truth. We also compare the GR2D2 estimator with the graphical horseshoe estimator on five cancer RNA-seq gene expression datasets grouped by three cancer types. Our results show that GR2D2 successfully identifies common cancer pathways and cancer-specific pathways for each dataset.
Publisher
Cold Spring Harbor Laboratory