Performance of artificial intelligence chatbot as a source of patient information on anti-rheumatic drug use in pregnancy

Published:2023-10-04 Issue:10 Volume:7 Page:651-655
ISSN:2602-2079
Container-title:Journal of Surgery and Medicine
Short-container-title:J Surg Med

Author:

Oruçoğlu Nurdan, Altunel Kılınç Elif

Abstract

Background/Aim: Women with rheumatic and musculoskeletal disorders often discontinue their medications prior to conception or during the first few weeks of pregnancy because drug use during pregnancy frequently causes anxiety. Pregnant women have reported seeking health-related information from a variety of sources, particularly the Internet, in an attempt to ease their concerns about the use of such medications during pregnancy. The objective of this study was to evaluate the accuracy and completeness of health-related information concerning the use of anti-rheumatic medications during pregnancy as provided by OpenAI's Chat Generative Pre-trained Transformer (ChatGPT) versions 3.5 and 4, which are widely known artificial intelligence (AI) tools.

Methods: In this prospective cross-sectional study, the performance of OpenAI's ChatGPT versions 3.5 and 4 was assessed regarding health information concerning anti-rheumatic drugs during pregnancy, using the 2016 European Alliance of Associations for Rheumatology (EULAR) guidelines as a reference. Fourteen queries from the guidelines were entered into both AI models. Responses were rated independently by two evaluators for accuracy using a predefined 6-point Likert-like scale (1 – completely incorrect to 6 – completely correct) and for completeness using a 3-point Likert-like scale (1 – incomplete to 3 – complete). Inter-rater reliability was evaluated using Cohen’s kappa statistic, and the differences in scores across ChatGPT versions were compared using the Mann–Whitney U test.

Results: No statistically significant difference was found between the mean accuracy scores of GPT 3.5 and GPT 4 (5 [1.17] versus 5.07 [1.26]; P=0.769), indicating that scores for both models fell between nearly all accurate and completely correct. Additionally, no statistically significant difference was found between the mean completeness scores of GPT 3.5 and GPT 4 (2.5 [0.51] versus 2.64 [0.49]; P=0.541), indicating scores between adequate and comprehensive for both models. Both models had similar total mean accuracy and completeness scores (3.75 [1.55] versus 3.86 [1.57]; P=0.717). In the GPT 3.5 model, hydroxychloroquine and leflunomide received the highest scores for both accuracy and completeness, while methotrexate, sulfasalazine, cyclophosphamide, mycophenolate mofetil, and tofacitinib received the highest total scores in the GPT 4 model. Nevertheless, for both models, one of the 14 drugs was scored as more incorrect than correct.

Conclusions: When considering the safety and compatibility of anti-rheumatic medications during pregnancy, both ChatGPT versions 3.5 and 4 demonstrated satisfactory accuracy and completeness. However, the responses generated by ChatGPT also contained inaccurate information. Despite its good performance, ChatGPT should not be used as a standalone tool to make decisions about taking medications during pregnancy due to this AI tool’s limitations.
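The two statistics named in the Methods can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function names `cohens_kappa` and `mann_whitney_u` are hypothetical, the kappa shown is the unweighted form (the abstract does not say whether a weighted variant was used), and the ratings in the demo are made-up placeholder values, not study data. Only the U statistic is computed; a P value would normally come from a statistics package such as `scipy.stats.mannwhitneyu`.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Proportion of items on which the two raters gave the same score.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


def mann_whitney_u(x, y):
    """Return (U_x, U_y) for two independent samples; ties count one half."""
    u_x = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    return u_x, len(x) * len(y) - u_x


if __name__ == "__main__":
    # Hypothetical accuracy ratings (1 to 6) from two evaluators; NOT study data.
    evaluator_1 = [6, 5, 5, 4, 6, 2]
    evaluator_2 = [6, 5, 4, 4, 6, 2]
    print("kappa:", cohens_kappa(evaluator_1, evaluator_2))

    # Hypothetical per-query scores for two model versions; NOT study data.
    model_a = [5, 6, 4, 5]
    model_b = [6, 6, 5, 3]
    print("U statistics:", mann_whitney_u(model_a, model_b))
```

The pairwise definition of U used here is equivalent to the rank-sum formulation and keeps the sketch short; for ordinal Likert-type scores with many ties, the half-count for tied pairs is what makes the two U values sum to the product of the sample sizes.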

Publisher

SelSistem

Subject

General Engineering

Reference: 22 articles.

1. Cooper GS, Stroehla BC. The epidemiology of autoimmune diseases. Autoimmun Rev. 2003;2(3):119-25. doi: 10.1016/s1568-9972(03)00006-5.

2. Desai RJ, Huybrechts KF, Bateman BT, Hernandez-Diaz S, Mogun H, Gopalakrishnan C, et al. Brief Report: Patterns and Secular Trends in Use of Immunomodulatory Agents During Pregnancy in Women With Rheumatic Conditions. Arthritis Rheumatol. 2016;68(5):1183-9. doi: 10.1002/art.39521.

3. Grimes HA, Forster DA, Newton MS. Sources of information used by women during pregnancy to meet their information needs. Midwifery. 2014;30(1):e26-33. doi: 10.1016/j.midw.2013.10.007.

4. Serçekuş P, Değirmenciler B, Özkan S. Internet use by pregnant women seeking childbirth information. J Gynecol Obstet Hum Reprod. 2021;50(8):102144. doi: 10.1016/j.jogoh.2021.102144.

5. Bramham K, Soh MC, Nelson-Piercy C. Pregnancy and renal outcomes in lupus nephritis: an update and guide to management. Lupus. 2012;21(12):1271-83. doi: 10.1177/0961203312456893.