Development and Evaluation of a Digital Scribe: Conversation Summarization Pipeline for Emergency Department Counseling Sessions towards Reducing Documentation Burden

Published: 2023-12-07

Author:

Sezgin Emre, Sirrianni Joseph, Kranz Kelly

Abstract

Objective: We present a proof-of-concept digital scribe system, built as a summarization pipeline for emergency department (ED) clinical conversations, and report its performance.

Materials and Methods: We use four pre-trained large language models to establish the digital scribe system: T5-small, T5-base, PEGASUS-PubMed, and BART-Large-CNN, via zero-shot and fine-tuning approaches. Our dataset includes 100 referral conversations among ED clinicians and medical records. We report ROUGE-1, ROUGE-2, and ROUGE-L to compare model performance. In addition, we annotated transcriptions to assess the quality of generated summaries.

Results: The fine-tuned BART-Large-CNN model performs best on the summarization task, with the highest ROUGE scores (ROUGE-1 F1 = 0.49, ROUGE-2 F1 = 0.23, ROUGE-L F1 = 0.35). In contrast, PEGASUS-PubMed lags notably (ROUGE-1 F1 = 0.28, ROUGE-2 F1 = 0.11, ROUGE-L F1 = 0.22). BART-Large-CNN's performance decreases by more than 50% with the zero-shot approach. Annotations show that BART-Large-CNN achieves 71.4% recall in identifying key information and a 67.7% accuracy rate.

Discussion: The BART-Large-CNN model demonstrates a high level of understanding of clinical dialogue structure, indicated by its performance with and without fine-tuning. Despite some instances of high recall, the model's performance is variable, particularly in achieving consistent correctness, which suggests room for refinement. The model's recall also varies across information categories.

Conclusion: The study provides evidence of the potential of AI-assisted tools to reduce clinical documentation burden. Future work should expand the research scope to larger language models and include comparative analysis that measures documentation effort and time.
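To make the ROUGE-1, ROUGE-2, and ROUGE-L F1 scores reported in the abstract more concrete, the following is a minimal sketch of how such scores can be computed. It is an illustration under stated assumptions, not the authors' evaluation code: the abstract does not say which ROUGE implementation was used, the tokenization here is plain lowercased whitespace splitting (no stemming), and the two example sentences are hypothetical and not taken from the study's dataset.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(matches, candidate_total, reference_total):
    """Harmonic mean of precision and recall; 0.0 when nothing matches."""
    if matches == 0 or candidate_total == 0 or reference_total == 0:
        return 0.0
    precision = matches / candidate_total
    recall = matches / reference_total
    return 2 * precision * recall / (precision + recall)


def rouge_n_f1(candidate, reference, n):
    """ROUGE-N F1 from clipped n-gram overlap (assumed whitespace tokens)."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum((cand & ref).values())  # Counter & keeps the minimum counts
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate, reference):
    """ROUGE-L F1 from the longest common subsequence of the token lists."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    return f1(lcs_length(cand, ref), len(cand), len(ref))


# Hypothetical example sentences, invented for illustration only.
reference = "patient referred to social work for housing support"
candidate = "patient referred for housing support"

print(round(rouge_n_f1(candidate, reference, 1), 2))  # 0.77
print(round(rouge_n_f1(candidate, reference, 2), 2))  # 0.55
print(round(rouge_l_f1(candidate, reference), 2))     # 0.77
```

In this toy case every candidate word appears in the reference, so precision is 1.0 and the F1 score is pulled down only by recall. Published ROUGE toolkits typically add stemming and other normalization, so their scores can differ from this simplified version.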

Publisher

Cold Spring Harbor Laboratory

Cited by 1 article.

1. Improving Documentation Quality and Patient Interaction with AI: A Tool for Transforming Medical Records — An Experience Report;2024-07-09