Hubs of knowledge: using the functional link structure in Biozon to mine for biologically significant entities

Published:2006-02-15 Issue:1 Volume:7 Page:
ISSN:1471-2105
Container-title:BMC Bioinformatics
language:en
Short-container-title:BMC Bioinformatics

Author:

Shafer Paul, Isganitis Timothy, Yona Golan

Abstract

Background: Existing biological databases support a variety of queries such as keyword or definition search. However, they do not provide any measure of relevance for the instances reported, and result sets are usually sorted arbitrarily.

Results: We describe a system that builds upon the complex infrastructure of the Biozon database and applies methods similar to those of Google to rank documents that match queries. We explore different prominence models and study the spectral properties of the corresponding data graphs. We evaluate the information content of principal and non-principal eigenspaces, and test various scoring functions which combine contributions from multiple eigenspaces. We also test the effect of similarity data and other variations which are unique to the biological knowledge domain on the quality of the results. Query result sets are assessed using a probabilistic approach that measures the significance of coherence between directly connected nodes in the data graph. This model allows us, for the first time, to compare different prominence models quantitatively and effectively and to observe unique trends.

Conclusion: Our tests show that the ranked query results outperform unsorted results with respect to our significance measure and the top ranked entities are typically linked to many other biological entities. Our study resulted in a working ranking system of biological entities that was integrated into Biozon at http://biozon.org.
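To make the idea of a link-based prominence score concrete, the following is a minimal sketch of a PageRank-style power iteration on a small directed graph. It is not the authors' implementation: the abstract does not specify the prominence models, so the damping factor, the handling of dangling nodes, the function name `pagerank`, and the toy node names are all assumptions made for illustration.

```python
def pagerank(edges, damping=0.85, tol=1e-10, max_iter=200):
    """PageRank-style scores by power iteration over (src, dst) edge pairs.

    Assumed, generic formulation; not taken from the Biozon paper.
    """
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {v: [] for v in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}  # start from a uniform distribution

    for _ in range(max_iter):
        # Score held by nodes with no outgoing links is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {v: base for v in nodes}

        # Each node passes an equal share of its score along every out-link.
        for v in nodes:
            if out_links[v]:
                share = damping * rank[v] / len(out_links[v])
                for w in out_links[v]:
                    new_rank[w] += share

        delta = sum(abs(new_rank[v] - rank[v]) for v in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical toy graph: three proteins link to one family, which links
# to a pathway, which links back to one of the proteins.
edges = [
    ("protein_A", "family_X"),
    ("protein_B", "family_X"),
    ("protein_C", "family_X"),
    ("family_X", "pathway_P"),
    ("pathway_P", "protein_A"),
]
scores = pagerank(edges)
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy graph the heavily linked node `family_X` ends up at the top of `ranked`, which mirrors the abstract's observation that top-ranked entities are typically linked to many other entities. The scores correspond to the principal eigenvector of the damped link matrix; the non-principal eigenspaces that the abstract mentions are not covered by this sketch.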

Publisher

Springer Science and Business Media LLC

Subject

Applied Mathematics, Computer Science Applications, Molecular Biology, Biochemistry, Structural Biology

Link

https://link.springer.com/content/pdf/10.1186/1471-2105-7-71.pdf


Cited by 9 articles.

1. Complex Network Based Computational Techniques for ‘Edgetic’ Modelling of Mutations Implicated with Cardiovascular Disease;Advances in Intelligent Systems and Computing;2016-09-07

2. An overlapping module identification method in protein-protein interaction networks;BMC Bioinformatics;2012-05-08

3. Using Medians to Generate Consensus Rankings for Biological Data;Lecture Notes in Computer Science;2011

4. The structure of collaboration in the Journal of Finance;Scientometrics;2010-06-10

5. XML-based approaches for the integration of heterogeneous bio-molecular data;BMC Bioinformatics;2009-10