Combining entity co-occurrence with specialized word embeddings to measure entity relation in Alzheimer’s disease

Published: 2019-12 Issue: S5 Volume: 19 Page:
ISSN: 1472-6947
Container-title: BMC Medical Informatics and Decision Making
Language: en
Short-container-title: BMC Med Inform Decis Mak

Author:

Heo Go Eun, Xie Qing, Song Min, Lee Jeong-Hoon

Abstract

Background: Extracting useful information from biomedical literature plays an important role in the development of modern medicine. In natural language processing, there have been rigorous attempts to find meaningful relationships between entities automatically by co-occurrence-based methods. It has been increasingly important to understand whether relationships exist, and if so how strong, between any two entities extracted from a large number of texts. One of the defining methods is to measure semantic similarity and relatedness between two entities.

Methods: We propose a hybrid ranking method that combines a co-occurrence approach considering both direct and indirect entity pair relationship with specialized word embeddings for measuring the relatedness of two entities.

Results: We evaluate the proposed ranking method comparatively with other well-known methods such as co-occurrence, Word2Vec, COALS (Correlated Occurrence Analog to Lexical Semantics), and random indexing by calculating top-ranked entities related to Alzheimer’s disease. In addition, we analyze gene, pathway, and gene–phenotype relationships. Overall, the proposed method tends to find more hidden relationships than the other methods.

Conclusion: Our proposed method is able to select more useful related entities that not only highly co-occur but also have more indirect relations for the target entity. In pathway analysis, our proposed method shows superior performance at identifying (functional) cross clustering and higher-level pathways. Our proposed method, resulting from phenotype analysis, has an advantage in identifying the common genotype relating to phenotypes from biological literature.
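The abstract does not give the scoring formula, so the following is only an illustrative sketch of the general idea: a relatedness score that mixes direct co-occurrence, indirect co-occurrence through shared neighbours, and embedding similarity. The function names, the `alpha` and `beta` weights, the min-based indirect score, and the toy documents and vectors are all assumptions made for this example and are not taken from the paper.

```python
import math
from collections import defaultdict
from itertools import combinations


def cooccurrence_counts(docs):
    """Count how often each unordered entity pair shares a document."""
    counts = defaultdict(int)
    for entities in docs:
        for a, b in combinations(sorted(set(entities)), 2):
            counts[(a, b)] += 1
    return counts


def direct_score(counts, a, b):
    """Number of documents in which a and b appear together."""
    return counts.get(tuple(sorted((a, b))), 0)


def indirect_score(counts, vocab, a, b):
    """Hypothetical indirect link: strength of paths a - c - b via a third entity c."""
    return sum(
        min(direct_score(counts, a, c), direct_score(counts, c, b))
        for c in vocab
        if c not in (a, b)
    )


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def hybrid_rank(target, docs, embeddings, alpha=0.5, beta=0.5):
    """Rank entities by alpha * normalised co-occurrence + (1 - alpha) * cosine.

    alpha and beta are illustrative weights, not values from the paper.
    """
    counts = cooccurrence_counts(docs)
    vocab = sorted({e for d in docs for e in d})
    candidates = [e for e in vocab if e != target]
    raw = {
        e: direct_score(counts, target, e)
        + beta * indirect_score(counts, vocab, target, e)
        for e in candidates
    }
    top = max(raw.values(), default=0) or 1  # avoid division by zero
    scores = {
        e: alpha * raw[e] / top
        + (1 - alpha) * cosine(embeddings[target], embeddings[e])
        for e in candidates
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data (invented): each document is the list of entities it mentions.
docs = [["AD", "APOE", "amyloid"], ["AD", "APOE"], ["APOE", "tau"], ["amyloid", "tau"]]
embeddings = {"AD": [1.0, 0.0], "APOE": [0.9, 0.1], "amyloid": [0.5, 0.5], "tau": [0.8, 0.2]}

hybrid = [e for e, _ in hybrid_rank("AD", docs, embeddings)]
direct_only = [e for e, _ in hybrid_rank("AD", docs, embeddings, alpha=1.0, beta=0.0)]
print(hybrid)
print(direct_only)
```

In this toy data, "tau" never co-occurs with "AD", so a purely direct co-occurrence ranking places it last. The combined score moves it above "amyloid" because of its indirect links and its embedding similarity, which is the kind of hidden relationship the abstract refers to.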

Publisher

Springer Science and Business Media LLC

Subject

Health Informatics, Health Policy, Computer Science Applications

Link

http://link.springer.com/content/pdf/10.1186/s12911-019-0934-5.pdf

Reference: 34 articles.

1. Xing W, Qi J, Yuan X, Li L, Zhang X, Fu Y, Xiong S, Hu L, Peng J. A gene–phenotype relationship extraction pipeline from the biomedical literature using a representation learning approach. Bioinformatics. 2018;34(13):i386–94.

2. Klein D, Manning CD. Accurate unlexicalized parsing. In: Proceedings of the 41st annual meeting on Association for Computational Linguistics, volume 1; 2003. p. 423–30. Association for Computational Linguistics.

3. Fundel K, Küffner R, Zimmer R. RelEx—relation extraction using dependency parse trees. Bioinformatics. 2006;23(3):365–71.

4. Hunter L, Lu Z, Firby J, Baumgartner WA, Johnson HL, Ogren PV, Cohen KB. OpenDMAP: an open source, ontology-driven concept analysis engine, with applications to capturing knowledge regarding protein transport, protein interactions and cell-type-specific gene expression. BMC Bioinformatics. 2008;9(1):78.

5. Coulet A, Shah NH, Garten Y, Musen M, Altman RB. Using text to build semantic networks for pharmacogenomics. J Biomed Inform. 2010;43(6):1009–19.

Cited by 7 articles.

1. RelCurator: a text mining-based curation system for extracting gene–phenotype relationships specific to neurodegenerative disorders;Genes & Genomics;2023-06-10

2. An automatic hypothesis generation for plausible linkage between xanthium and diabetes;Scientific Reports;2022-10-20

3. Multi-faceted semantic clustering with text-derived phenotypes;Computers in Biology and Medicine;2021-11

4. A Framework To Build A Causal Knowledge Graph for Chronic Diseases and Cancers By Discovering Semantic Associations from Biomedical Literature;2021 IEEE 9th International Conference on Healthcare Informatics (ICHI);2021-08

5. Multi-faceted Semantic Clustering With Text-derived Phenotypes;2021-05-29