GeneCOCOA: Detecting context-specific functions of individual genes using co-expression data

Author:

Zehr Simonida, Wolf Sebastian, Oellerich Thomas, Leisegang Matthias S., Brandes Ralf P., Schulz Marcel H., Warwick Timothy

Abstract

Extraction of meaningful biological insight from gene expression profiling often focuses on the identification of statistically enriched terms or pathways. These methods typically use gene sets as input data, and subsequently return overrepresented terms along with associated statistics describing their enrichment. This approach does not cater to analyses focused on a single gene-of-interest, particularly when the gene lacks prior functional characterization. To address this, we formulated GeneCOCOA, a method which utilizes context-specific gene co-expression and curated functional gene sets, but focuses on a user-supplied gene-of-interest. The co-expression between the gene-of-interest and subsets of genes from functional groups (e.g. pathways, GO terms) is derived using linear regression, and the resulting root-mean-square error values are compared against background values obtained from randomly selected genes. The resulting p-values provide a statistical ranking of functional gene sets from any collection, along with their associated terms, based on their co-expression with the gene-of-interest in a manner specific to the context and experiment. GeneCOCOA thereby provides biological insight into both gene function and the putative regulatory mechanisms by which the expression of the gene-of-interest is controlled. Despite its relative simplicity, GeneCOCOA outperforms similar methods in the accurate recall of known gene-disease associations. GeneCOCOA is formulated as an R package for ease of use, available at https://github.com/si-ze/geneCOCOA.

Author summary

Understanding the biological functions of different genes and their respective products is a key element of modern biological research. While one can examine the relative abundance of a gene product in transcriptomics data, this alone does not provide any clue to the biological relevance of the gene. Using a type of analysis called co-expression, it is possible to identify other genes which have similar patterns of regulation to a gene-of-interest, but again, this cannot tell you what a gene does. Genes whose function has previously been studied are often assembled into groups (e.g. pathways, ontologies), which can be used to annotate gene sets of interest. However, if a gene has not yet been characterized, it will not appear in these gene set enrichment analyses. Here, we propose a new method, GeneCOCOA, which uses co-expression of a single gene with genes in functional groups to identify which functional group the gene is most similar to, resulting in a putative function for the gene, even if it has not been studied before. We tested GeneCOCOA by using it to find gene-disease links which have already been scientifically studied, and showed that GeneCOCOA can do this more effectively than other available methods.
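
The regression-and-background comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the GeneCOCOA R package or its API: the function names (`rmse_of_fit`, `sampled_rmses`, `gene_set_p_value`), the parameters (`subset_size`, `n_iter`), and the synthetic expression matrix are all hypothetical. The abstract does not say how the RMSE values are compared with the background, so the sketch assumes a simple empirical p-value: the fraction of background RMSEs that are at or below the median gene-set RMSE.

```python
import numpy as np


def rmse_of_fit(y, X):
    """RMSE of an ordinary least-squares fit of y on the columns of X plus an intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    return float(np.sqrt(np.mean(residuals ** 2)))


def sampled_rmses(y, expr, candidates, subset_size, n_iter, rng):
    """Regress y on random subsets of candidate genes (rows of expr); collect the RMSEs."""
    errors = np.empty(n_iter)
    for i in range(n_iter):
        picked = rng.choice(candidates, size=subset_size, replace=False)
        errors[i] = rmse_of_fit(y, expr[picked].T)
    return errors


def gene_set_p_value(expr, goi, gene_set, subset_size=3, n_iter=200, seed=0):
    """Empirical p-value for co-expression between gene `goi` and `gene_set`.

    Assumption (not from the paper): background subsets are drawn from all genes
    other than the gene-of-interest, and the p-value is the share of background
    RMSEs that are at or below the median gene-set RMSE (with a +1 correction).
    """
    rng = np.random.default_rng(seed)
    y = expr[goi]
    members = np.array([g for g in gene_set if g != goi])
    background = np.array([g for g in range(expr.shape[0]) if g != goi])
    set_errors = sampled_rmses(y, expr, members, subset_size, n_iter, rng)
    bg_errors = sampled_rmses(y, expr, background, subset_size, n_iter, rng)
    observed = np.median(set_errors)
    return float((np.sum(bg_errors <= observed) + 1) / (n_iter + 1))


# Synthetic data: 40 genes (rows) x 30 samples (columns).
# Genes 0-5 share a latent signal; every other gene is independent noise.
data_rng = np.random.default_rng(1)
latent = data_rng.normal(size=30)
expr = data_rng.normal(size=(40, 30))
expr[:6] = latent + 0.3 * data_rng.normal(size=(6, 30))

p_coexpressed = gene_set_p_value(expr, goi=0, gene_set=[1, 2, 3, 4, 5])
p_unrelated = gene_set_p_value(expr, goi=0, gene_set=[10, 11, 12, 13, 14])
print(f"co-expressed set: p = {p_coexpressed:.3f}")
print(f"unrelated set:    p = {p_unrelated:.3f}")
```

In this toy setup the gene set that shares a signal with the gene-of-interest should receive the smaller p-value. Keeping `subset_size` well below the number of samples matters, because a regression with as many predictors as samples fits perfectly and its RMSE carries no information.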

Publisher

Cold Spring Harbor Laboratory
