Evaluating semantic similarity methods for comparison of text-derived phenotype profiles

Published:2022-02-05 Issue:1 Volume:22
ISSN:1472-6947
Container-title:BMC Medical Informatics and Decision Making
language:en
Short-container-title:BMC Med Inform Decis Mak

Author:

Slater Luke T.,Russell Sophie,Makepeace Silver,Carberry Alexander,Karwath Andreas,Williams John A.,Fanning Hilary,Ball Simon,Hoehndorf Robert,Gkoutos Georgios V.

Abstract

Background: Semantic similarity is a valuable tool for analysis in biomedicine. When applied to phenotype profiles derived from clinical text, it has the capacity to enable and enhance ‘patient-like me’ analyses, automated coding, differential diagnosis, and outcome prediction. While a large body of work explores the use of semantic similarity for multiple tasks, including protein interaction prediction and rare disease differential diagnosis, there is less work exploring comparison of patient phenotype profiles for clinical tasks. Moreover, there are no experimental explorations of optimal parameters or better methods in the area.

Methods: We develop a platform for reproducible benchmarking and comparison of experimental conditions for patient phenotype similarity. Using the platform, we evaluate the task of ranking shared primary diagnosis from uncurated phenotype profiles derived from all text narrative associated with admissions in the Medical Information Mart for Intensive Care (MIMIC-III).

Results: 300 semantic similarity configurations were evaluated, as well as one embedding-based approach. On average, measures that did not make use of an external information content measure performed slightly better; however, the best-performing configurations, when measured by area under the receiver operating characteristic curve and Top Ten Accuracy, used term-specificity and annotation-frequency measures.

Conclusion: We identified and interpreted the performance of a large number of semantic similarity configurations for the task of classifying diagnosis from text-derived phenotype profiles in one setting. We also provided a basis for further research on other settings and related tasks in the area.
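To make the idea of a "semantic similarity configuration" concrete, the following is a minimal sketch of one possible configuration: an annotation-frequency information content measure, Resnik term similarity, and best-match-average aggregation over profiles. These specific choices, the toy ontology, the patient profiles, and all function names are illustrative assumptions; the abstract does not state which measures or ontology the authors used.

```python
import math
from collections import Counter

# Hypothetical toy ontology (child -> parents); these are not real ontology terms.
PARENTS = {
    "root": set(),
    "cardiac": {"root"},
    "respiratory": {"root"},
    "arrhythmia": {"cardiac"},
    "murmur": {"cardiac"},
    "cough": {"respiratory"},
}

# Hypothetical text-derived phenotype profiles, one set of terms per patient.
PROFILES = {
    "patient_a": {"arrhythmia", "murmur"},
    "patient_b": {"arrhythmia"},
    "patient_c": {"cough"},
}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content(profiles):
    """Annotation-frequency IC: -log of the share of profiles annotated
    with a term or any of its descendants."""
    counts = Counter()
    for terms in profiles.values():
        closure = set()
        for term in terms:
            closure |= ancestors(term)
        counts.update(closure)
    total = len(profiles)
    return {term: -math.log(n / total) for term, n in counts.items()}


def resnik(a, b, ic):
    """Resnik similarity: IC of the most informative common ancestor.
    Terms that never occur in any profile are given IC 0 here (an assumption)."""
    return max(ic.get(term, 0.0) for term in ancestors(a) & ancestors(b))


def best_match_average(p, q, ic):
    """Average, in both directions, of each term's best match in the other profile."""
    def one_way(xs, ys):
        return sum(max(resnik(x, y, ic) for y in ys) for x in xs) / len(xs)
    return (one_way(p, q) + one_way(q, p)) / 2


def rank_similar(query, profiles, ic):
    """Rank all other patients by decreasing similarity to the query patient."""
    scores = {
        other: best_match_average(profiles[query], terms, ic)
        for other, terms in profiles.items()
        if other != query
    }
    return sorted(scores, key=scores.get, reverse=True)


IC = information_content(PROFILES)
print(rank_similar("patient_a", PROFILES, IC))
```

In a ranking evaluation of the kind the abstract describes, a ranked list like this one would be scored by whether patients sharing the query's primary diagnosis appear near the top, for example within the first ten results. Swapping the information content measure, the term similarity, or the aggregation step yields a different configuration to benchmark.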

Publisher

Springer Science and Business Media LLC

Subject

Health Informatics,Health Policy,Computer Science Applications

Link

https://link.springer.com/content/pdf/10.1186/s12911-022-01770-4.pdf
