Automatic, context-specific generation of Gene Ontology slims

Published:2010-10-07 Issue:1 Volume:11 Page:
ISSN:1471-2105
Container-title:BMC Bioinformatics
language:en
Short-container-title:BMC Bioinformatics

Author:

Davis Melissa J,Sehgal Muhammad Shoaib B,Ragan Mark A

Abstract

Background: The use of ontologies to control vocabulary and structure annotation has added value to genome-scale data, and contributed to the capture and re-use of knowledge across research domains. Gene Ontology (GO) is widely used to capture detailed expert knowledge in genomic-scale datasets and as a consequence has grown to contain many terms, making it unwieldy for many applications. To increase its ease of manipulation and efficiency of use, subsets called GO slims are often created by collapsing terms upward into more general, high-level terms relevant to a particular context. Creation of a GO slim currently requires manipulation and editing of GO by an expert (or community) familiar with both the ontology and the biological context. Decisions about which terms to include are necessarily subjective, and the creation process itself and subsequent curation are time-consuming and largely manual.

Results: Here we present an objective framework for generating customised ontology slims for specific annotated datasets, exploiting information latent in the structure of the ontology graph and in the annotation data. This framework combines ontology engineering approaches, and a data-driven algorithm that draws on graph and information theory. We illustrate this method by application to GO, generating GO slims at different information thresholds, characterising their depth of semantics and demonstrating the resulting gains in statistical power.

Conclusions: Our GO slim creation pipeline is available for use in conjunction with any GO-annotated dataset, and creates dataset-specific, objectively defined slims. This method is fast and scalable for application to other biomedical ontologies.
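The abstract does not spell out the algorithm, so the following is only a minimal sketch of one generic way to "collapse terms upward" under an information threshold. It is not the authors' pipeline. It assumes a toy ontology and toy annotations (all term names, gene names, and function names here are hypothetical), and it uses the common annotation-frequency definition of information content, IC(t) = -log2 p(t), where p(t) is the fraction of annotated genes attached to t or any of its descendants.

```python
import math

# Hypothetical toy ontology: each term maps to its parent terms (a DAG).
PARENTS = {
    "root": [],
    "metabolism": ["root"],
    "signalling": ["root"],
    "lipid_metabolism": ["metabolism"],
    "glycolysis": ["metabolism"],
}

# Hypothetical direct annotations: gene -> set of terms.
ANNOTATIONS = {
    "g1": {"lipid_metabolism"},
    "g2": {"glycolysis"},
    "g3": {"glycolysis"},
    "g4": {"signalling"},
}


def ancestors(term, parents):
    """Return the term itself plus all of its ancestors in the DAG."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def information_content(parents, annotations):
    """IC(t) = -log2(fraction of genes annotated to t or a descendant)."""
    genes_per_term = {term: set() for term in parents}
    for gene, terms in annotations.items():
        for term in terms:
            # Propagate each annotation upward to every ancestor.
            for anc in ancestors(term, parents):
                genes_per_term[anc].add(gene)
    total = len(annotations)
    return {
        term: -math.log2(len(genes) / total)
        for term, genes in genes_per_term.items()
        if genes  # terms with no annotations have undefined IC here
    }


def slim_terms(ic, threshold):
    """Keep only terms whose information content is at or below the threshold."""
    return {term for term, value in ic.items() if value <= threshold}


def collapse(term, parents, slim, ic):
    """Map a term to its most informative ancestor(s) that are in the slim."""
    candidates = ancestors(term, parents) & slim
    if not candidates:
        return set()
    best = max(ic[t] for t in candidates)
    return {t for t in candidates if ic[t] == best}


ic = information_content(PARENTS, ANNOTATIONS)
slim = slim_terms(ic, threshold=1.0)
print(sorted(slim))  # ['glycolysis', 'metabolism', 'root']
print(collapse("lipid_metabolism", PARENTS, slim, ic))  # {'metabolism'}
```

Raising the threshold admits more specific terms into the slim, and lowering it pushes annotations toward the root. The choice of threshold and the tie-breaking rule in `collapse` are illustrative assumptions only.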

Publisher

Springer Science and Business Media LLC

Subject

Applied Mathematics,Computer Science Applications,Molecular Biology,Biochemistry,Structural Biology

Link

https://link.springer.com/content/pdf/10.1186/1471-2105-11-498.pdf

Cited by 33 articles.

1. vissE: a versatile tool to identify and visualise higher-order molecular phenotypes from functional enrichment analysis;BMC Bioinformatics;2024-02-08

2. Simplify enrichment: A bioconductor package for clustering and visualizing functional enrichment results;Genomics, Proteomics & Bioinformatics;2022-06

3. vissE: A versatile tool to identify and visualise higher-order molecular phenotypes from functional enrichment analysis;2022-03-07

4. Making Common Fund data more findable: catalyzing a data ecosystem;GigaScience;2022

5. Biological and Medical Ontologies: Disease Ontology (DO);Encyclopedia of Bioinformatics and Computational Biology;2019