Gene Set to Diseases (GS2D): disease enrichment analysis on human gene sets with literature data

Published:2016-10-30 Issue:1 Volume:2 Page:33
ISSN:2365-7154
Container-title:Genomics and Computational Biology
Short-container-title:Genomics Comput Biol

Author:

Fontaine Jean Fred, Andrade-Navarro Miguel A

Abstract

Large sets of candidate genes derived from high-throughput biological experiments can be characterized by functional enrichment analysis. The analysis consists of comparing the functions of one gene set against those of a background gene set. Functions related to a significant number of genes in the gene set are then expected to be relevant. Web tools offering disease enrichment analysis on gene sets are often based on gene-disease associations from manually curated or experimental data that are accurate but do not cover all diseases discussed in the literature. Using associations automatically derived from literature data could be a cost-effective method to improve the coverage of diseases for enrichment analysis at comparable levels of accuracy. We have implemented a method named Gene Set to Diseases (GS2D) as a web tool performing disease enrichment analysis on human protein-coding gene sets. It uses an automatically built dataset of more than 63 thousand gene-disease associations defined as statistically significant co-occurrences of genes and diseases in annotations of biomedical citations from PubMed. The dataset covers more diseases for enrichment analysis than the largest comparable curated database, the Comparative Toxicogenomics Database, and its performance compared favourably to similar approaches based on manually curated or experimental data. Graphical and programmatic interfaces are available at http://cbdm.uni-mainz.de/geneset2diseases.
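
The comparison of a gene set against a background set described in the abstract can be illustrated with a small sketch. It assumes a one-sided hypergeometric test, a common choice for enrichment analysis; the abstract does not state which test GS2D uses, so this is not a description of the tool itself. The gene identifiers, disease names, and function names below are hypothetical and chosen only for illustration.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """Probability of seeing at least k disease genes in a query of size n,
    when K of the N background genes are associated with the disease."""
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def enrich(query, background, disease_genes):
    """Return (disease, overlap, p-value) tuples sorted by ascending p-value.

    Genes outside the background are ignored on both sides of the test.
    """
    query = set(query) & set(background)
    results = []
    for disease, genes in disease_genes.items():
        genes = set(genes) & set(background)
        overlap = len(query & genes)
        p = hypergeom_enrichment_p(overlap, len(query), len(genes), len(background))
        results.append((disease, overlap, p))
    return sorted(results, key=lambda r: r[2])


# Hypothetical toy data: 20 background genes and two invented diseases.
background = {f"G{i}" for i in range(1, 21)}
disease_genes = {
    "disease_A": {"G1", "G2", "G3", "G4"},
    "disease_B": {"G5", "G10", "G15", "G20"},
}
query = {"G1", "G2", "G3", "G5", "G6"}

results = enrich(query, background, disease_genes)
for disease, overlap, p in results:
    print(f"{disease}: overlap={overlap}, p={p:.4f}")
```

In a real analysis many diseases are tested at once, so the raw p-values would normally be adjusted for multiple testing before any disease is reported as enriched.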

Publisher

Kernel Press UG (haftungsbeschränkt)

Cited by 22 articles.

1. Overlooked poor-quality patient samples in sequencing data impair reproducibility of published clinically relevant datasets;Genome Biology;2024-08-16

2. A Systematic Review of Lipid-Focused Cardiovascular Disease Research: Trends and Opportunities;Current Issues in Molecular Biology;2023-12-09

3. Injury-specific factors in the cerebrospinal fluid regulate astrocyte plasticity in the human brain;Nature Medicine;2023-12

4. Extreme gradient boosting machine learning algorithm identifies genome-wide genetic variants in prostate cancer risk prediction;2023-11-01

5. DisGeReExT: a knowledge discovery system for exploration of disease–gene associations through large-scale literature-wide analysis study;Knowledge and Information Systems;2023-04-10