Affiliation:
1. School of Electronic Information, Northwestern Polytechnical University, Xi’an 710072, China
Abstract
Semantic similarity calculation is central to medical information processing: it assesses how similar the professional medical terms stored in medical databases are to one another. Language models based on Bidirectional Encoder Representations from Transformers (BERT) offer a new approach to semantic representation for similarity calculation. However, because medical terminology is highly specialized, these models often fail to represent semantically similar medical terms accurately, which in turn reduces the accuracy of similarity calculations. To address this challenge, this study adapts BERT during training using Chat Generative Pre-trained Transformer (ChatGPT) and a contrastive loss, enhancing its semantic representation capability and improving the accuracy of similarity calculations. Specifically, we use ChatGPT-3.5 to generate semantically similar texts for medical terms and incorporate them into model training as pseudo-labels. A contrastive loss then minimizes the distance between relevant samples and maximizes the distance between irrelevant samples, improving the performance of medical similarity models, especially when training samples are limited. The method is validated experimentally on the open Electronic Health Record (OpenEHR) dataset, which is randomly divided into four groups to verify its effectiveness.
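The abstract does not state which form of contrastive loss is used. As an illustration only, the sketch below assumes the classic pairwise, margin-based formulation: a similar pair is penalized by its squared distance, and a dissimilar pair is penalized only when it lies closer than a margin. The function names, the `margin` value, and the toy two-dimensional vectors (standing in for BERT embeddings of a term, a generated paraphrase, and an unrelated term) are hypothetical and not taken from the paper.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def contrastive_loss(u, v, is_similar, margin=1.0):
    """Pairwise margin-based contrastive loss (an assumed form, not the paper's).

    Similar pairs are pulled together (loss grows with distance);
    dissimilar pairs are pushed apart until they are at least `margin` away.
    """
    d = euclidean(u, v)
    if is_similar:
        return d ** 2
    return max(0.0, margin - d) ** 2


# Toy embeddings (hypothetical values chosen for easy arithmetic).
term = [0.0, 0.0]
paraphrase = [0.3, 0.4]  # distance 0.5 from `term`
unrelated = [3.0, 4.0]   # distance 5.0 from `term`

loss_similar = contrastive_loss(term, paraphrase, is_similar=True)
loss_far_negative = contrastive_loss(term, unrelated, is_similar=False)
loss_near_negative = contrastive_loss(term, paraphrase, is_similar=False)
```

Under this formulation, a dissimilar pair that is already farther apart than the margin contributes no loss, so training effort concentrates on negatives that the model currently places too close to the anchor term.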
Funder
National Natural Science Foundation of China
Fundamental Research Funds for the Central Universities