Author:
Zhou Haosheng, Lin Wei, Labra Sergio R., Lipton Stuart A., Schork Nicholas J., Rangan Aaditya V.
Abstract
Many traditional methods for analyzing gene-gene relationships focus on positive and negative correlations, both of which are a kind of ‘symmetric’ relationship. However, genes can also exhibit ‘asymmetric’ relationships, such as ‘if-then’ relationships used in boolean circuits. In this paper we develop a very general method that can be used to detect biclusters within gene-expression data that involve subsets of genes which are enriched for these ‘boolean-asymmetric’ relationships (BARs). These BAR-biclusters can correspond to heterogeneity that is driven by asymmetric gene-gene interactions, rather than more standard symmetric interactions. We apply our method to a single-cell RNA-sequencing data-set, demonstrating that the statistically-significant BAR-biclusters indeed contain additional information not present within more traditional ‘boolean-symmetric’-biclusters. For example, the BAR-biclusters involve different subsets of cells, and highlight different gene-pathways within the data-set. Moreover, by combining the boolean-asymmetric- and boolean-symmetric-signals, one can build linear classifiers which outperform those built using only traditional boolean-symmetric signals.
Author summary
One important step when analyzing large data-sets is to search for ‘heterogeneity’, i.e., subsets of the data that exhibit special characteristics. In the context of gene-expression data, this heterogeneity can take the form of a ‘bicluster’: a subset of samples across which certain genes interact in a special way. Traditionally, strategies for detecting biclusters have focused on gene-gene relationships such as correlations and anti-correlations; that is to say, gene-subsets that act in concert. In this paper we discuss a simple strategy for detecting biclusters that exhibit more general gene-gene relationships, such as one gene serving as a prerequisite for the expression of another, or a pair of genes which mutually exclude one another. Using a data-set from a study of Alzheimer’s disease, we demonstrate that these more general biclusters can be quite statistically significant, and can contain novel information not readily found in more traditional biclusters.
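As a toy illustration only (this is not the bicluster-detection method developed in the paper), the distinction between symmetric and ‘if-then’ relationships can be seen by counting how many samples fall into each of the four boolean quadrants for a pair of binarized genes. The sketch below assumes expression has already been binarized to on/off per sample; the function names `quadrant_counts` and `describe_pair`, and the rule of requiring a quadrant to be exactly empty, are hypothetical choices made for this example.

```python
import numpy as np


def quadrant_counts(a, b):
    """Count samples in each of the four boolean quadrants for a gene pair."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    return {
        "a1_b1": int(np.sum(a & b)),    # both genes on
        "a1_b0": int(np.sum(a & ~b)),   # A on, B off
        "a0_b1": int(np.sum(~a & b)),   # A off, B on
        "a0_b0": int(np.sum(~a & ~b)),  # both genes off
    }


def describe_pair(a, b):
    """Label a gene pair by which quadrants are empty (toy rule, not a statistical test)."""
    c = quadrant_counts(a, b)
    # Symmetric cases: two opposite quadrants are empty.
    if c["a1_b0"] == 0 and c["a0_b1"] == 0:
        return "symmetric: correlated"
    if c["a1_b1"] == 0 and c["a0_b0"] == 0:
        return "symmetric: anti-correlated"
    # If-then cases: exactly one of the checked quadrants is empty.
    if c["a1_b0"] == 0:
        return "if-then: A on implies B on"
    if c["a0_b1"] == 0:
        return "if-then: B on implies A on"
    if c["a1_b1"] == 0:
        return "if-then: A and B mutually exclusive"
    return "no empty quadrant"


# Gene A is only ever on when gene B is on (B acts like a prerequisite for A).
gene_a = [1, 1, 0, 0, 0, 0]
gene_b = [1, 1, 1, 1, 0, 0]
print(describe_pair(gene_a, gene_b))
```

In real data a quadrant would rarely be exactly empty, so any practical version would need a notion of a quadrant being depleted relative to chance; that step is left out here.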
Publisher
Cold Spring Harbor Laboratory
Cited by
3 articles.