Author:
Pan Tony, Chockalingam Sriram P, Aluru Maneesha, Aluru Srinivas
Abstract
Motivation: Gene regulatory network (GRN) reconstruction from gene expression profiles is a compute- and data-intensive problem. Numerous methods have been proposed, based on diverse approaches including mutual information, random forests, Bayesian networks, and correlation measures, as well as their transforms and filters such as the data processing inequality. However, an effective GRN reconstruction method that performs well in all three aspects of computational efficiency, data size scalability, and output quality remains elusive. Simple techniques such as Pearson correlation are fast to compute but ignore indirect interactions, while more robust methods such as Bayesian networks are prohibitively time-consuming to apply to tens of thousands of genes.
Results: We developed MCP Score, a novel maximum-capacity-path-based metric to quantify the relative strengths of direct and indirect gene-gene interactions. We further present MCPNet, an efficient, parallelized GRN reconstruction software based on MCP Score, to reconstruct networks in unsupervised and semi-supervised manners. Using synthetic and real S. cerevisiae datasets as well as real A. thaliana datasets, we demonstrate that MCPNet produces better quality networks as measured by AUPR, is significantly faster than all other GRN inference software, and also scales well to tens of thousands of genes and hundreds of CPU cores. Thus, MCPNet represents a new GRN inference tool that simultaneously achieves quality, performance, and scalability requirements.
Availability: Source code is freely available for download at https://doi.org/10.5281/zenodo.6499748 and https://github.com/AluruLab/MCPNet, implemented in C++ and supported on Linux.
Contact: aluru@cc.gatech.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
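To make the "maximum capacity path" idea concrete, the following is a minimal sketch of the generic widest-path computation on a small weighted graph. It is not the MCP Score definition or the MCPNet implementation, neither of which is given in this abstract. The function name `max_capacity_paths`, the three-gene weight matrix, and the reading of edge weights as pairwise association strengths (for example, mutual information) are assumptions made purely for illustration.

```python
def max_capacity_paths(weights):
    """All-pairs maximum-capacity (widest) path values.

    weights[i][j] >= 0 is the direct edge weight between nodes i and j
    (0 means no edge). The capacity of a path is its smallest edge weight;
    the result holds, for every pair, the largest capacity over all paths.
    Uses a Floyd-Warshall style update with (max, min) in place of (min, +).
    """
    n = len(weights)
    cap = [row[:] for row in weights]  # copy so the input is left untouched
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                through_k = min(cap[i][k], cap[k][j])
                if through_k > cap[i][j]:
                    cap[i][j] = through_k
    return cap


# Hypothetical association strengths among three genes (illustrative values).
weights = [
    [0.0, 0.9, 0.2],
    [0.9, 0.0, 0.8],
    [0.2, 0.8, 0.0],
]
cap = max_capacity_paths(weights)
print(cap[0][2])  # capacity of the best path between gene 0 and gene 2
```

In this toy input the direct weight between genes 0 and 2 is 0.2, while the path through gene 1 has capacity 0.8, so the indirect route is the stronger one. Comparing direct weights against such path capacities is the general kind of contrast a path-based metric can draw on; how MCP Score actually combines them is specified in the paper, not here.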
Publisher
Cold Spring Harbor Laboratory