Combining string and phonetic similarity matching to identify misspelt names of drugs in medical records written in Portuguese

Published: 2019-11 Issue: S1 Volume: 10
ISSN: 2041-1480
Container-title: Journal of Biomedical Semantics
Language: en
Short-container-title: J Biomed Semant

Author:

Tissot Hegler, Dobson Richard

Abstract

Background: There is an increasing amount of unstructured medical data that can be analysed for different purposes. However, information extraction from free-text data may be particularly inefficient in the presence of spelling errors. Existing approaches use string similarity methods to search for valid words within a text, coupled with a supporting dictionary. However, they are not rich enough to encode both typing and phonetic misspellings.

Results: Experimental results showed that a joint string and language-dependent phonetic similarity is more accurate than traditional string distance metrics when identifying misspelt names of drugs in a set of medical records written in Portuguese.

Conclusion: We present a hybrid approach to efficiently perform similarity matching that overcomes the loss of information inherent in using either exact match search or string-based similarity search methods.
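The idea of joining string and phonetic similarity can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the rule list `PHONETIC_RULES`, the functions `phonetic_key`, `hybrid_similarity` and `best_match`, the equal weighting and the 0.8 threshold are all assumptions chosen for illustration, and the phonetic rules are a deliberately tiny, hypothetical approximation of Portuguese spelling variation.

```python
import unicodedata


def levenshtein(a: str, b: str) -> int:
    """Edit distance computed with the standard dynamic-programming recurrence."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def string_similarity(a: str, b: str) -> float:
    """Edit distance normalised to [0, 1]; 1.0 means identical strings."""
    if not a and not b:
        return 1.0
    return 1.0 - levenshtein(a, b) / max(len(a), len(b))


# Hypothetical rewrite rules (assumption, not taken from the paper).
# Order matters: digraphs are rewritten before the silent "h" is dropped.
PHONETIC_RULES = [("ph", "f"), ("ch", "x"), ("ç", "s"), ("ss", "s"),
                  ("z", "s"), ("y", "i"), ("h", "")]


def phonetic_key(word: str) -> str:
    """Map a word to a crude phonetic key so similar-sounding spellings agree."""
    text = unicodedata.normalize("NFC", word.lower())
    for source, target in PHONETIC_RULES:
        text = text.replace(source, target)
    # Strip any remaining accents.
    text = "".join(ch for ch in unicodedata.normalize("NFD", text)
                   if unicodedata.category(ch) != "Mn")
    # Collapse runs of the same letter.
    collapsed = []
    for ch in text:
        if not collapsed or collapsed[-1] != ch:
            collapsed.append(ch)
    return "".join(collapsed)


def hybrid_similarity(a: str, b: str, weight: float = 0.5) -> float:
    """Weighted mean of surface-string and phonetic-key similarity."""
    surface = string_similarity(a.lower(), b.lower())
    phonetic = string_similarity(phonetic_key(a), phonetic_key(b))
    return weight * surface + (1.0 - weight) * phonetic


def best_match(token: str, dictionary: list[str], threshold: float = 0.8):
    """Return (entry, score) for the closest dictionary entry, or None."""
    score, entry = max((hybrid_similarity(token, e), e) for e in dictionary)
    return (entry, score) if score >= threshold else None


drugs = ["dipirona", "paracetamol", "amoxicilina"]
print(best_match("dypirona", drugs))
```

In this sketch a phonetic misspelling such as "dypirona" scores higher under the hybrid measure than under edit distance alone, because its phonetic key coincides with that of "dipirona". A real system would need a far richer, language-specific phonetic model and a tuned weight and threshold.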

Publisher

Springer Science and Business Media LLC

Subject

Computer Networks and Communications, Health Informatics, Computer Science Applications, Information Systems

Link

http://link.springer.com/content/pdf/10.1186/s13326-019-0216-2.pdf

References: 24 articles.

1. Jellouli I, Mohajir ME. An ontology-based approach for web information extraction. In: 2011 Colloquium in Information Science and Technology. IEEE: 2011. https://doi.org/10.1109/cist.2011.6148583.

2. Shvaiko P, Euzenat J. Ontology Matching: State of the Art and Future Challenges. IEEE Trans Knowl Data Eng. 2013;25(1):158–76. https://doi.org/10.1109/tkde.2011.253.

3. Karystianis G, Sheppard T, Dixon WG, Nenadic G. Modelling and extraction of variability in free-text medication prescriptions from an anonymised primary care electronic medical record research database. BMC Med Inf Dec Mak. 2016;16(1). https://doi.org/10.1186/s12911-016-0255-x.

4. Uzuner O, Solti I, Cadag E. Extracting medication information from clinical text. JAMIA. 2010; 17(5):514–8.

5. Jensen PB, Jensen LJ, Brunak S. Mining electronic health records: towards better research applications and clinical care. Nat Rev Genet. 2012; 13(6):395–405. https://doi.org/10.1038/nrg3208.

Cited by 8 articles.

1. Razy: A String Matching Algorithm for Automatic Analysis of Pathological Reports; Axioms; 2022-10-12

2. Text Similarity Measurement Method and Application of Online Medical Community Based on Density Peak Clustering; Journal of Organizational and End User Computing; 2022-05-12

3. Study on Named Entity Recognition in Chinese Literatures on Hypertension treatment; Proceedings of the 2021 International Conference on Intelligent Medicine and Health; 2021-08-13

4. Improving Risk Assessment of Miscarriage During Pregnancy with Knowledge Graph Embeddings; Journal of Healthcare Informatics Research; 2021-05-01

5. Identification of Synonyms Using Definition Similarities in Japanese Medical Device Adverse Event Terminology; Applied Sciences; 2021-04-19