Potentials and pitfalls of ChatGPT and natural-language artificial intelligence models for the understanding of laboratory medicine test results. An assessment by the European Federation of Clinical Chemistry and Laboratory Medicine (EFLM) Working Group on Artificial Intelligence (WG-AI)

Author:

Cadamuro Janne (1), Cabitza Federico (2, 3), Debeljak Zeljko (4, 5), De Bruyne Sander (6), Frans Glynis (7), Perez Salomon Martin (8), Ozdemir Habib (9), Tolios Alexander (10), Carobene Anna (11), Padoan Andrea (12)

Affiliation:

1. Department of Laboratory Medicine, Paracelsus Medical University Salzburg, Salzburg, Austria

2. DISCo, Università degli Studi di Milano-Bicocca, Milano, Italy

3. IRCCS Istituto Ortopedico Galeazzi, Milan, Italy

4. Faculty of Medicine, Josip Juraj Strossmayer University of Osijek, Osijek, Croatia

5. Clinical Institute of Laboratory Diagnostics, University Hospital Center Osijek, Osijek, Croatia

6. Department of Laboratory Medicine, Ghent University Hospital, Ghent, Belgium

7. Department of Laboratory Medicine, University Hospitals Leuven, KU Leuven, Leuven, Belgium

8. Unidad de Bioquímica Clínica, Hospital Universitario Virgen Macarena, Sevilla, Spain

9. Department of Medical Biochemistry, Faculty of Medicine, Manisa Celal Bayar University, Manisa, Türkiye

10. Department of Transfusion Medicine and Cell Therapy, Medical University of Vienna, Vienna, Austria

11. IRCCS San Raffaele Scientific Institute, Milan, Italy

12. Department of Medicine (DIMED), University of Padova, Padova, Italy

Abstract

Objectives: ChatGPT, a tool based on natural language processing (NLP), is on everyone's mind, and several potential applications in healthcare have already been proposed. However, since the ability of this tool to interpret laboratory test results has not yet been tested, the EFLM Working Group on Artificial Intelligence (WG-AI) has set itself the task of closing this gap with a systematic approach.

Methods: WG-AI members generated 10 simulated laboratory reports of common parameters, which were then passed to ChatGPT for interpretation, according to reference intervals (RI) and units, using an optimized prompt. The results were subsequently evaluated independently by all WG-AI members with respect to relevance, correctness, helpfulness and safety.

Results: ChatGPT recognized all laboratory tests, detected whether they deviated from the RI, and gave a test-by-test as well as an overall interpretation. The interpretations were rather superficial, not always correct, and judged coherently by the evaluators in only some cases. The magnitude of the deviation from the RI seldom played a role in ChatGPT's interpretation of the laboratory tests, and the artificial intelligence (AI) did not make any meaningful suggestion regarding follow-up diagnostics or further procedures in general.

Conclusions: ChatGPT in its current form, not being specifically trained on medical data or laboratory data in particular, may at best be considered a tool capable of interpreting a laboratory report on a test-by-test basis, but not of interpreting an overall diagnostic picture. Future generations of similar AIs trained on medical ground truth data may well revolutionize current processes in healthcare, although the current implementation is not yet ready.

Publisher

Walter de Gruyter GmbH

Subject

Biochemistry (medical), Clinical Biochemistry, General Medicine
