Toward an interactive article: integrating journals and biological databases

Published:2011-05-19 Issue:1 Volume:12 Page:
ISSN:1471-2105
Container-title:BMC Bioinformatics
Language:en
Short-container-title:BMC Bioinformatics

Author:

Rangarajan Arun,Schedl Tim,Yook Karen,Chan Juancarlos,Haenel Stephen,Otis Lolly,Faelten Sharon,DePellegrin-Connelly Tracey,Isaacson Ruth,Skrzypek Marek S,Marygold Steven J,Stefancsik Raymund,Cherry J Michael,Sternberg Paul W,Müller Hans-Michael

Abstract

Background: Journal articles and databases are two major modes of communication in the biological sciences, and thus integrating these critical resources is of urgent importance to increase the pace of discovery. Projects focused on bridging the gap between journals and databases have been on the rise over the last five years and have resulted in the development of automated tools that can recognize entities within a document and link those entities to a relevant database. Unfortunately, automated tools cannot resolve ambiguities that arise from one term being used to signify entities that are quite distinct from one another. Instead, resolving these ambiguities requires some manual oversight. Finding the right balance between the speed and portability of automation and the accuracy and flexibility of manual effort is a crucial goal to making text markup a successful venture.

Results: We have established a journal article mark-up pipeline that links GENETICS journal articles and the model organism database (MOD) WormBase. This pipeline uses a lexicon built with entities from the database as a first step. The entity markup pipeline results in links from over nine classes of objects including genes, proteins, alleles, phenotypes and anatomical terms. New entities and ambiguities are discovered and resolved by a database curator through a manual quality control (QC) step, along with help from authors via a web form that is provided to them by the journal. New entities discovered through this pipeline are immediately sent to an appropriate curator at the database. Ambiguous entities that do not automatically resolve to one link are resolved by hand, ensuring an accurate link. This pipeline has been extended to other databases, namely Saccharomyces Genome Database (SGD) and FlyBase, and has been implemented in marking up a paper with links to multiple databases.

Conclusions: Our semi-automated pipeline hyperlinks articles published in GENETICS to model organism databases such as WormBase. Our pipeline results in interactive articles that are data rich with high accuracy. The use of a manual quality control step sets this pipeline apart from other hyperlinking tools and results in benefits to authors, journals, readers and databases.
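
The abstract describes a lexicon-driven first pass followed by manual QC for ambiguous terms but gives no implementation details. The following is only a minimal sketch of that general idea, not the authors' pipeline: the lexicon contents, the identifiers, and the names `LEXICON`, `Match`, `build_pattern` and `mark_up` are all hypothetical. Terms that map to exactly one entity are linked automatically; terms with several candidates are set aside for a curator.

```python
import re
from dataclasses import dataclass

# Hypothetical lexicon: surface term -> list of (entity class, identifier).
# The identifiers are illustrative placeholders, not real database records.
LEXICON = {
    "lin-12": [("gene", "GENE:0001")],
    "LIN-12": [("protein", "PROT:0001")],
    "pharynx": [("anatomy", "ANAT:0001")],
    # One term with two candidate entities: cannot be linked automatically.
    "abc-1": [("gene", "GENE:0002"), ("gene", "GENE:0003")],
}


@dataclass
class Match:
    term: str
    start: int
    end: int
    candidates: list


def build_pattern(lexicon):
    # Longest terms first so a longer term wins over a shorter one at the
    # same position; the lookarounds stop matches inside larger tokens
    # such as "lin-120".
    terms = sorted(lexicon, key=len, reverse=True)
    alternation = "|".join(re.escape(t) for t in terms)
    return re.compile(rf"(?<![\w-])(?:{alternation})(?![\w-])")


def mark_up(text, lexicon):
    """Return (auto-linked matches, matches needing manual review)."""
    linked, review = [], []
    for m in build_pattern(lexicon).finditer(text):
        candidates = lexicon[m.group(0)]
        match = Match(m.group(0), m.start(), m.end(), candidates)
        # A single candidate is unambiguous; anything else goes to QC.
        (linked if len(candidates) == 1 else review).append(match)
    return linked, review


text = "Mutations in lin-12 and abc-1 affect the pharynx; LIN-12 is a receptor."
linked, review = mark_up(text, LEXICON)
```

Matching is case-sensitive in this sketch so that a gene name and a protein name differing only in case can be kept as separate lexicon entries; a real system would need a far richer lexicon and a way to record the curator's decision.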

Publisher

Springer Science and Business Media LLC

Subject

Applied Mathematics,Computer Science Applications,Molecular Biology,Biochemistry,Structural Biology

Link

https://link.springer.com/content/pdf/10.1186/1471-2105-12-175.pdf

References: 20 articles.

1. Pafilis E, O'Donoghue SI, Jensen LJ, Horn H, Kuhn M, Brown NP, Schneider R: Reflect: augmented browsing for the life scientist. Nat Biotech 2009, 27(6):508–510. 10.1038/nbt0609-508

2. Textpresso search engine [http://www.textpresso.org]

3. Müller HM, Kenny EE, Sternberg PW: Textpresso: an ontology-based information retrieval and extraction system for biological literature. PLoS Biol 2004, 2(11):e309. 10.1371/journal.pbio.0020309

4. WormBase - the biology and genome of C. elegans [http://www.wormbase.org]

5. Saccharomyces Genome Database (SGD) [http://www.yeastgenome.org]

Cited by 12 articles.

1. Maximum Entropy (MaxEnt) as extreme distribution indicator of two Neotropical fruit fly parasitoids in irrigated drylands of Argentina;Bulletin of Entomological Research;2022-03-01

2. Harmonizing model organism data in the Alliance of Genome Resources;Genetics;2022-02-25

3. The Descent of Databases;Genetics;2021-03-01

4. Text mining meets community curation: a newly designed curation platform to improve author experience and participation at WormBase;Database;2020-01-01

5. Micropublication: incentivizing community curation and placing unpublished data into the public domain;Database;2018-01-01