Assessing the Performance of a New Artificial Intelligence–Driven Diagnostic Support Tool Using Medical Board Exam Simulations: Clinical Vignette Study

Published: 2021-11-30
Volume: 9
Issue: 11
Page: e32507
ISSN: 2291-9694
Container-title: JMIR Medical Informatics
Language: en
Short-container-title: JMIR Med Inform

Author:

Ben-Shabat Niv, Sloma Ariel, Weizman Tomer, Kiderman David, Amital Howard

Abstract

Background: Diagnostic decision support systems (DDSS) are computer programs that aim to improve health care by supporting clinicians in diagnostic decision-making. Previous studies on DDSS demonstrated their ability to enhance clinicians’ diagnostic skills, prevent diagnostic errors, and reduce hospitalization costs. Despite these potential benefits, their use in clinical practice is limited, emphasizing the need for new and improved products.

Objective: The aim of this study was to conduct a preliminary analysis of the diagnostic performance of “Kahun,” a new artificial intelligence-driven diagnostic tool.

Methods: Diagnostic performance was evaluated based on the program’s ability to “solve” clinical cases from United States Medical Licensing Examination Step 2 Clinical Skills board exam simulations drawn from the case banks of 3 leading preparation companies. Each case included 3 expected differential diagnoses. The cases were entered into the Kahun platform by 3 blinded junior physicians. For each case, the presence and rank of the correct diagnoses within the generated differential diagnosis list were recorded. Diagnostic performance was measured in two ways: first, as diagnostic sensitivity, and second, as case-specific success rates that represent diagnostic comprehensiveness.

Results: The study included 91 clinical cases with 78 different chief complaints and a mean of 38 (SD 8) findings per case. The total number of expected diagnoses was 272, of which 174 were different (some appeared more than once). Of the 272 expected diagnoses, 231 (87.5%; 95% CI 76-99) were suggested within the top 20 listed diagnoses, 209 (76.8%; 95% CI 66-87) within the top 10, and 168 (61.8%; 95% CI 52-71) within the top 5. The median rank of correct diagnoses was 3 (IQR 2-6). Of the 91 cases, all expected diagnoses were suggested within the top 20 listed diagnoses in 62 (68%; 95% CI 59-78), within the top 10 in 44 (48%; 95% CI 38-59), and within the top 5 in 24 (26%; 95% CI 17-35). In 87 of the 91 cases (96%; 95% CI 91-100), at least 2 of the 3 expected diagnoses were suggested within the top 20 listed diagnoses; in 78 (86%; 95% CI 79-93), within the top 10; and in 61 (67%; 95% CI 57-77), within the top 5.

Conclusions: The diagnostic support tool evaluated in this study demonstrated good diagnostic accuracy and comprehensiveness; it also had the ability to manage a wide range of clinical findings.
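The two measures named in the Methods can be illustrated with a minimal sketch. It assumes each case is recorded as the list of ranks at which its expected diagnoses appeared in the generated list, with `None` for a diagnosis that was not listed. The function names, the data layout, and the toy values are hypothetical and are not taken from the study or from the Kahun platform.

```python
from typing import Dict, List, Optional

# Hypothetical layout: case id -> rank of each expected diagnosis in the
# tool's differential list (None means the diagnosis was not listed).
Ranks = Dict[str, List[Optional[int]]]


def top_k_sensitivity(ranks: Ranks, k: int) -> float:
    """Share of all expected diagnoses that appear at rank <= k."""
    all_ranks = [r for case in ranks.values() for r in case]
    hits = sum(1 for r in all_ranks if r is not None and r <= k)
    return hits / len(all_ranks)


def case_success_rate(ranks: Ranks, k: int, min_hits: int) -> float:
    """Share of cases with at least `min_hits` expected diagnoses at rank <= k."""
    successes = 0
    for case in ranks.values():
        hits = sum(1 for r in case if r is not None and r <= k)
        if hits >= min_hits:
            successes += 1
    return successes / len(ranks)


# Toy data for illustration only; these are not study results.
toy: Ranks = {
    "case_a": [1, 4, 12],
    "case_b": [2, None, 7],
    "case_c": [3, 5, 25],
    "case_d": [1, 2, 3],
}

if __name__ == "__main__":
    print(top_k_sensitivity(toy, k=10))              # 9 of 12 diagnoses
    print(case_success_rate(toy, k=10, min_hits=2))  # at least 2 of 3 per case
    print(case_success_rate(toy, k=10, min_hits=3))  # all 3 per case
```

On the toy data, the top-10 sensitivity is 0.75, every case has at least 2 of its 3 expected diagnoses in the top 10, and only one case in four has all 3. This mirrors the reported pattern in which the "all expected diagnoses" criterion is stricter than the "at least 2 of 3" criterion.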

Publisher

JMIR Publications Inc.

Subject

Health Information Management, Health Informatics

References: 26 articles.

1. Performance of Four Computer-Based Diagnostic Systems

2. Effects of a Decision Support System on Physicians' Diagnostic Performance

3. Enhancement of Clinicians' Diagnostic Reasoning by Computer-Based Consultation

4. Judgment under Uncertainty: Heuristics and Biases

5. Cognitive biases associated with medical decisions: a systematic review

Cited by 5 articles.

1. The Potential of Evidence-Based Clinical Intake Tools to Discover or Ground Prevalence of Symptoms Using Real-Life Digital Health Encounters: Retrospective Cohort Study; Journal of Medical Internet Research; 2024-07-16

2. Can AI pass the written European Board Examination in Neurological Surgery? - Ethical and practical issues; Brain and Spine; 2024

3. The potential of evidence based clinical intake tools to discover or ground correlations between disorders and symptoms (Preprint); 2023-06-17

4. Response to Ben-Shabat et al.’s “Assessing data gathering of chatbot based symptom checkers – A clinical vignettes study”; International Journal of Medical Informatics; 2023-02

5. Impact of uncertainty intolerance on clinical reasoning: A scoping review of the 21st‐century literature; Journal of Evaluation in Clinical Practice; 2022-09-07