The Efficacy of a Named Entity Recognition AI Model for Identifying Incidental Pulmonary Nodules in CT Reports

Published: 2024-07-27
ISSN: 0846-5371
Container-title: Canadian Association of Radiologists Journal
Language: en
Short-container-title: Can Assoc Radiol J

Author:

Mojibian Alireza¹, Jaskolka Jeff²³, Ching Geoffrey⁴, Lee Brian¹, Myers Renelle⁵⁶⁷, Devine Chloe¹, Nicolaou Savvas¹⁵⁸, Parker William¹⁸⁹

Affiliation:

1. Sapien Machine Learning Corporation (SapienML), Vancouver, BC, Canada

2. Radiology Department, Brampton Civic Hospital, Brampton, ON, Canada

3. Faculty of Medicine - Medical Imaging, University of Toronto, Toronto, ON, Canada

4. Schulich School of Medicine & Dentistry – University of Western Ontario, London, ON, Canada

5. Faculty of Medicine, University of British Columbia, Vancouver, BC, Canada

6. BC Cancer Agency, Provincial Health Services Authority, Vancouver, BC, Canada

7. Respirology, Vancouver General Hospital, Vancouver, BC, Canada

8. Radiology Department, Vancouver General Hospital, Vancouver, BC, Canada

9. Radiology Department, Nanaimo Regional General Hospital, Nanaimo, BC, Canada

Abstract

Purpose: This study evaluates the efficacy of a commercial medical Named Entity Recognition (NER) model combined with a post-processing protocol in identifying incidental pulmonary nodules from CT reports.

Methods: We analyzed 9165 anonymized CT reports and classified them into 3 categories: no nodules, nodules present, and nodules >6 mm. For each report, a generic medical NER model annotated entities and their relations, which were then filtered through inclusion/exclusion criteria selected to identify pulmonary nodules. Ground truth was established by manual review. To better understand the relationship between model performance and nodule prevalence, a subset of the data was programmatically balanced to equalize the number of reports in each class category.

Results: In the unbalanced subset of the data, the model achieved a sensitivity of 97%, specificity of 99%, and accuracy of 99% in detecting pulmonary nodules mentioned in the reports. For nodules >6 mm, sensitivity was 95%, specificity was 100%, and accuracy was 100%. In the balanced subset of the data, sensitivity was 99%, specificity 96%, and accuracy 97% for nodule detection; for larger nodules, sensitivity was 94%, specificity 99%, and accuracy 98%.

Conclusions: The NER model demonstrated high sensitivity and specificity in detecting pulmonary nodules reported in CT scans, including those >6 mm, which are potentially clinically significant. The results were consistent across both unbalanced and balanced datasets, indicating that the model performance is independent of nodule prevalence. Implementing this technology in hospital systems could automate the identification of at-risk patients, ensuring timely follow-up and potentially reducing missed or late-stage cancer diagnoses.
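
The evaluation described in the abstract (sensitivity, specificity, and accuracy against manually reviewed ground truth, plus a class-balanced subset) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `binary_metrics` and `balance_by_class`, the class label strings, and the choice of random downsampling to the rarest class are all assumptions made for illustration, since the abstract does not say how the balancing was done.

```python
import random
from collections import defaultdict


def binary_metrics(y_true, y_pred):
    """Sensitivity, specificity and accuracy for binary labels (1 = nodule reported)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        "accuracy": (tp + tn) / len(pairs) if pairs else float("nan"),
    }


def balance_by_class(labels, seed=0):
    """Return sorted indices of a subset with equal counts per class.

    Assumption: balancing by random downsampling to the rarest class;
    the source only says the subset was "programmatically balanced".
    """
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    per_class = min(len(indices) for indices in by_class.values())
    rng = random.Random(seed)
    kept = []
    for indices in by_class.values():
        kept.extend(rng.sample(indices, per_class))
    return sorted(kept)


if __name__ == "__main__":
    # Hypothetical toy labels, not data from the study.
    y_true = [1, 1, 1, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
    print(binary_metrics(y_true, y_pred))

    classes = ["none"] * 6 + ["nodule"] * 3 + ["nodule_gt_6mm"] * 2
    print(balance_by_class(classes))
```

Sensitivity and specificity are each computed within a single ground-truth class, so they do not depend on class proportions, whereas accuracy mixes both classes and shifts with prevalence. Comparing the metrics on the balanced and unbalanced subsets is one way to check that reported performance is not an artifact of the class mix.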

Publisher

SAGE Publications

Link

https://journals.sagepub.com/doi/pdf/10.1177/08465371241266785
