Deep Learning for Natural Language Processing in Urology: State-of-the-Art Automated Extraction of Detailed Pathologic Prostate Cancer Data From Narratively Written Electronic Health Records

Published:2018-12 Issue:2 Volume: Page:1-9
ISSN:2473-4276
Container-title:JCO Clinical Cancer Informatics
language:en
Short-container-title:JCO Clinical Cancer Informatics

Author:

Leyh-Bannurah Sami-Ramzi¹,Tian Zhe¹,Karakiewicz Pierre I.¹,Wolffgang Ulrich¹,Sauter Guido¹,Fisch Margit¹,Pehrke Dirk¹,Huland Hartwig¹,Graefen Markus¹,Budäus Lars¹

Affiliation:

1. Sami-Ramzi Leyh-Bannurah, Dirk Pehrke, Hartwig Huland, Markus Graefen, and Lars Budäus, Prostate Cancer Center Hamburg-Eppendorf; Sami-Ramzi Leyh-Bannurah, Margit Fisch, and Guido Sauter, University Medical Center Hamburg-Eppendorf, Hamburg; Ulrich Wolffgang, University of Muenster, Muenster, Germany; and Zhe Tian and Pierre I. Karakiewicz, University of Montreal Health Center, Montreal, Canada.

Abstract

Purpose: Entering all information from narrative documentation for clinical research into databases is time consuming, costly, and nearly impossible. Even high-volume databases do not cover all patient characteristics and drawn results may be limited. A new viable automated solution is machine learning based on deep neural networks applied to natural language processing (NLP), extracting detailed information from narratively written (eg, pathologic radical prostatectomy [RP]) electronic health records (EHRs).

Methods: Within an RP pathologic database, 3,679 RP EHRs were randomly split into 70% training and 30% test data sets. Training EHRs were automatically annotated, providing a semiautomatically annotated corpus of narratively written pathologic reports with initially context-free gold standard encodings. Primary and secondary Gleason pattern, corresponding percentages, tumor stage, nodal stage, total volume, tumor volume and diameter, and surgical margin were variables of interest. Second, state-of-the-art NLP techniques were used to train an industry-standard language model for pathologic EHRs by transfer learning. Finally, accuracy of the named entity extractors was compared with the gold standard encodings.

Results: Agreement rates (95% confidence interval) for primary and secondary Gleason patterns each were 91.3% (89.4 to 93.0), corresponding to the following: Gleason percentages, 70.5% (67.6 to 73.3) and 80.9% (78.4 to 83.3); tumor stage, 99.3% (98.6 to 99.7); nodal stage, 98.7% (97.8 to 99.3); total volume, 98.3% (97.3 to 99.0); tumor volume, 93.3% (91.6 to 94.8); maximum diameter, 96.3% (94.9 to 97.3); and surgical margin, 98.7% (97.8 to 99.3). Cumulative agreement was 91.3%.

Conclusion: Our proposed NLP pipeline offers new abilities for precise and efficient data management from narrative documentation for clinical research. The scalable approach potentially allows the NLP pipeline to be generalized to other genitourinary EHRs, tumor entities, and other medical disciplines.
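
The evaluation design described in the abstract (a random 70%/30% split, then an agreement rate with a 95% confidence interval per variable) can be illustrated with a minimal sketch. This is not the authors' code. The abstract does not say how the intervals were computed, so the Wilson score interval used here is an assumption, and the function names `split_records` and `agreement_with_wilson_ci` are hypothetical.

```python
import math
import random


def split_records(record_ids, train_fraction=0.7, seed=0):
    """Shuffle record IDs reproducibly and split them into train and test lists."""
    ids = list(record_ids)
    random.Random(seed).shuffle(ids)
    cut = round(len(ids) * train_fraction)
    return ids[:cut], ids[cut:]


def agreement_with_wilson_ci(predicted, gold, z=1.96):
    """Return (rate, lower, upper) for exact-match agreement.

    The Wilson score interval is an assumption made for illustration;
    the abstract does not name the interval method that was used.
    """
    if len(predicted) != len(gold) or not gold:
        raise ValueError("predicted and gold must be non-empty and equal in length")
    n = len(gold)
    matches = sum(p == g for p, g in zip(predicted, gold))
    rate = matches / n
    denom = 1 + z**2 / n
    centre = (rate + z**2 / (2 * n)) / denom
    half = z * math.sqrt(rate * (1 - rate) / n + z**2 / (4 * n**2)) / denom
    return rate, centre - half, centre + half


# 3,679 reports, as in the abstract; the IDs themselves are placeholders.
train_ids, test_ids = split_records(range(3679))

# Toy example: extracted vs. gold-standard primary Gleason pattern for 10 reports.
predicted = ["3"] * 9 + ["4"]
gold = ["3"] * 10
rate, lower, upper = agreement_with_wilson_ci(predicted, gold)
print(f"agreement {rate:.1%} (95% CI {lower:.1%} to {upper:.1%})")
```

With a test set of roughly 1,100 reports, an interval of about plus or minus 1.7 percentage points around 91.3% is what a binomial proportion would give, which is in line with the width reported for the Gleason patterns.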

Publisher

American Society of Clinical Oncology (ASCO)

Subject

General Medicine

Link

https://ascopubs.org/doi/pdfdirect/10.1200/CCI.18.00080

Cited by 29 articles.

1. The Potential Impact of Large Language Models on Doctor–Patient Communication: A Case Study in Prostate Cancer;Healthcare;2024-08-05

2. Validation of a Zero-shot Learning Natural Language Processing Tool to Facilitate Data Abstraction for Urologic Research;European Urology Focus;2024-03

3. Machine learning applications in detection and diagnosis of urology cancers: a systematic literature review;Neural Computing and Applications;2024-01-29

4. Extracting structured information from unstructured histopathology reports using generative pre‐trained transformer 4 (GPT‐4);The Journal of Pathology;2023-12-14

5. A novel watermarking framework for intellectual property protection of NLG APIs;Neurocomputing;2023-11