Development and Evaluation of a Natural Language Processing System for Curating a Trans-Thoracic Echocardiogram (TTE) Database

Published:2023-11-10 Issue:11 Volume:10 Page:1307
ISSN:2306-5354
Container-title:Bioengineering
language:en
Short-container-title:Bioengineering

Author:

Dong Tim¹, Sunderland Nicholas¹, Nightingale Angus¹, Fudulu Daniel P.¹, Chan Jeremy¹, Zhai Ben², Freitas Alberto³, Caputo Massimo¹, Dimagli Arnaldo¹, Mires Stuart¹, Wyatt Mike⁴, Benedetto Umberto¹, Angelini Gianni D.¹

Affiliation:

1. Bristol Heart Institute, Translational Health Sciences, University of Bristol, Bristol BS2 8HW, UK

2. School of Computing Science, Northumbria University, Newcastle upon Tyne NE1 8ST, UK

3. Faculty of Medicine, University of Porto, 4100 Porto, Portugal

4. University Hospitals Bristol and Weston, Marlborough St, Bristol BS1 3NU, UK

Abstract

Background: Although electronic health records (EHRs) provide useful insights into disease patterns and patient treatment optimisation, their reliance on unstructured data makes them difficult to analyse. Echocardiography reports, which provide extensive pathology information for cardiovascular patients, are particularly challenging to extract and analyse because of their narrative structure. Although natural language processing (NLP) has been used successfully in a variety of medical fields, it is not commonly applied to echocardiography analysis. Objectives: To develop an NLP-based approach for extracting and categorising data from echocardiography reports by accurately converting continuous (e.g., LVOT VTI, AV VTI and TR Vmax) and discrete (e.g., regurgitation severity) outcomes from a semi-structured narrative format into a structured and categorised format, allowing for future research or clinical use. Methods: A total of 135,062 Trans-Thoracic Echocardiogram (TTE) reports were derived from 146,967 baseline echocardiogram reports and split into three cohorts: Training and Validation (n = 1075), Test Dataset (n = 98) and Application Dataset (n = 133,889). The NLP system was developed and iteratively refined using medical expert knowledge. The system was used to curate a moderate-fidelity database from extractions of 133,889 reports. A hold-out validation set of 98 reports was blindly annotated and extracted by two clinicians for comparison with the NLP extraction. Agreement, discrimination, accuracy and calibration of outcome measure extractions were evaluated. Results: Continuous outcomes including LVOT VTI, AV VTI and TR Vmax exhibited perfect inter-rater reliability as measured by intra-class correlation scores (ICC = 1.00, p < 0.05) alongside high R² values, demonstrating an ideal alignment between the NLP system and clinicians. A good level (ICC = 0.75–0.9, p < 0.05) of inter-rater reliability was observed for outcomes such as LVOT Diam, Lateral MAPSE, Peak E Velocity, Lateral E’ Velocity, PV Vmax, Sinuses of Valsalva and Ascending Aorta diameters. Furthermore, the accuracy rate for discrete outcome measures was 91.38% in the confusion matrix analysis, indicating effective performance. Conclusions: The NLP-based technique performed well in extracting and categorising data from echocardiography reports. The system demonstrated a high degree of agreement and concordance with clinician extractions. This study contributes to the effective use of semi-structured data by providing a useful tool for converting semi-structured text into a structured echo report that can be used for data management. Additional validation and implementation in healthcare settings can improve data availability and support research and clinical decision-making.
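As a rough illustration of the kind of conversion the abstract describes, the sketch below pulls continuous and discrete outcomes from a semi-structured report line and scores discrete extractions against clinician labels. It is not the authors' system: the regular expressions, the label spellings, the units, the sample sentence and the function names `extract_report` and `accuracy` are all assumptions made for this example.

```python
import re

# Hypothetical patterns: label spellings and units are assumed for
# illustration and are not taken from the paper.
CONTINUOUS_PATTERNS = {
    "LVOT VTI": re.compile(r"LVOT\s+VTI\s*[:=]?\s*(\d+(?:\.\d+)?)\s*cm", re.IGNORECASE),
    "AV VTI": re.compile(r"AV\s+VTI\s*[:=]?\s*(\d+(?:\.\d+)?)\s*cm", re.IGNORECASE),
    "TR Vmax": re.compile(r"TR\s+Vmax\s*[:=]?\s*(\d+(?:\.\d+)?)\s*m/s", re.IGNORECASE),
}

# Hypothetical severity vocabulary for regurgitation findings.
SEVERITY_PATTERN = re.compile(
    r"\b(no|trivial|mild|moderate|severe)\s+"
    r"(mitral|aortic|tricuspid|pulmonary)\s+regurgitation",
    re.IGNORECASE,
)


def extract_report(text):
    """Return a flat record of continuous and discrete outcomes found in text."""
    record = {}
    # Continuous outcomes: first numeric match per measure, or None if absent.
    for name, pattern in CONTINUOUS_PATTERNS.items():
        match = pattern.search(text)
        record[name] = float(match.group(1)) if match else None
    # Discrete outcomes: one severity category per valve mentioned.
    for severity, valve in SEVERITY_PATTERN.findall(text):
        record[f"{valve.lower()} regurgitation"] = severity.lower()
    return record


def accuracy(predicted, reference):
    """Fraction of discrete labels on which extraction and clinician agree."""
    if len(predicted) != len(reference):
        raise ValueError("label lists must have the same length")
    correct = sum(p == r for p, r in zip(predicted, reference))
    return correct / len(reference)


# Made-up report text, used only to exercise the functions above.
sample = (
    "LVOT VTI: 21.5 cm. AV VTI 45 cm. TR Vmax = 2.8 m/s. "
    "Mild mitral regurgitation. No aortic regurgitation."
)
print(extract_report(sample))
```

A rule-based extractor of this kind is easy to refine iteratively with clinician feedback, which is in the spirit of the refinement step mentioned in the Methods, although the abstract does not state which extraction technique the system actually uses.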

Publisher

MDPI AG

Subject

Bioengineering

Link

https://www.mdpi.com/2306-5354/10/11/1307/pdf

Cited by 4 articles.

1. Identifying the Severity of Heart Valve Stenosis and Regurgitation Among a Diverse Population Within an Integrated Healthcare System: A Natural Language Processing Approach (Preprint);JMIR Cardio;2024-05-13

2. Evaluating Large Language Models in Echocardiography Reporting: Opportunities and Challenges;2024-01-20

3. An AI-based prognostic model for postoperative outcomes in non-cardiac surgical patients utilizing TEE: A conceptual study;DIGITAL HEALTH;2024-01

4. Natural Language Processing in Electronic Health Record Mining for Clinical Decision Support;2023 International Conference on Artificial Intelligence for Innovations in Healthcare Industries (ICAIIHI);2023-12-29