O-089 Using ChatGPT to answer patient questions about fertility: the quality of information generated by a deep learning language model

Published:2023-06-01 Issue:Supplement_1 Volume:38 Page:
ISSN:0268-1161
Container-title:Human Reproduction
language:en

Author:

Beilby K¹, Hammarberg K²

Affiliation:

1. Monash University, Obstetrics & Gynaecology, Melbourne, Australia

2. Monash University, Global and Women's Health - Public Health and Preventive Medicine, Melbourne, Australia

Abstract

Study question: What is the quality of information provided by ChatGPT when using common patient questions as prompts?

Summary answer: Overall, the quality of the information generated by ChatGPT was high, with little evidence of commercial bias.

What is known already: People seeking fertility-related information rely on internet sources when deciding on reproductive planning and assisted conception. The quality of information within the commercial landscape of infertility treatment is poor. ChatGPT, a variant of Generative Pre-trained Transformer v3 (GPT-3), is a language model that uses deep learning to generate human-like text. Given prompts, it generates answers by predicting the next word in the sequence based on patterns learned from training data. The training data for GPT-3 is not curated, but is a snapshot of the Web, which includes all kinds of information, including biases that may exist within sources.

Study design, size, duration: Ten common patient questions were used as prompts. Three questions related to fertility awareness (impact of female/male age on fertility and the fertile window in the menstrual cycle), one to the chance of success with IVF, one to elective egg freezing, one to the benefits of add-ons, one to PCOS and pregnancy, one to choosing a fertility clinic, and one to how many IVF cycles should be attempted.

Participants/materials, setting, methods: Two experts independently scored the quality of the information generated by ChatGPT using a scoring matrix with a range of 0 to 7, where higher scores indicate higher quality. Text was rated against humanistic answers based on how well it corresponded (0-3), evidence of commercial bias or controversial claims (no = 1, yes = 0), use of accurate proportions/statistics, and whether it stated that medical advice should be sought (yes = 1, no = 0).

Main results and the role of chance: The scores returned by the two experts were closely aligned, with only a one-point difference for one of the answers. This discrepancy was resolved through discussion. While none of the answers received the maximum score of 7, 6/10 scored 5 or more and 3 received a score of 3-4. Only one answer, the answer to the question about the benefits of add-ons, scored less than 3. This was also the only question where the response had evidence of commercial bias, and one of only two that made claims that could be considered controversial.

Limitations, reasons for caution: The scoring method used in this study has not been validated and is exploratory in nature, as this area of evaluation is emerging. However, the use of expert evaluation is common when assessing the performance of machine learning models and is often used to fine-tune their parameters and improve their performance.

Wider implications of the findings: It is known that people seeking fertility-related information rely heavily on online sources such as clinic websites, consumer advocacy organisations, patient support groups and social media. Our findings suggest that ChatGPT may be a useful tool for patients seeking factual and unbiased information regarding fertility and fertility treatment.

Trial registration number: Not applicable
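
The 0 to 7 scoring matrix described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' instrument. The class and field names are hypothetical, and the sketch assumes that commercial bias and controversial claims are scored as two separate one-point items, since that reading is what makes the stated maximum of 7 add up (3 + 1 + 1 + 1 + 1). The `band` helper simply mirrors the score groupings reported in the results.

```python
from dataclasses import dataclass


@dataclass
class Rating:
    """One expert's rating of one answer (hypothetical structure, not from the study)."""
    correspondence: int           # 0-3: how well the answer corresponds to the reference answer
    commercial_bias: bool         # True if evidence of commercial bias was found
    controversial_claims: bool    # True if the answer made controversial claims
    accurate_statistics: bool     # True if proportions/statistics were accurate
    advises_medical_advice: bool  # True if the answer says medical advice should be sought

    def total(self) -> int:
        """Sum the items into a 0-7 score; higher means higher quality."""
        if not 0 <= self.correspondence <= 3:
            raise ValueError("correspondence must be between 0 and 3")
        return (
            self.correspondence
            + (0 if self.commercial_bias else 1)       # assumption: separate item, no = 1
            + (0 if self.controversial_claims else 1)  # assumption: separate item, no = 1
            + (1 if self.accurate_statistics else 0)
            + (1 if self.advises_medical_advice else 0)
        )


def band(score: int) -> str:
    """Group a total score into the bands used when reporting results."""
    if score >= 5:
        return "5 or more"
    if score >= 3:
        return "3-4"
    return "less than 3"


# Illustrative values only; these are not ratings from the study.
example = Rating(
    correspondence=3,
    commercial_bias=False,
    controversial_claims=False,
    accurate_statistics=False,
    advises_medical_advice=True,
)
print(example.total(), band(example.total()))
```

Under these assumptions, an answer that corresponds fully but lacks accurate statistics would score 6, which is consistent with no answer reaching the maximum of 7.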

Publisher

Oxford University Press (OUP)

Subject

Obstetrics and Gynecology, Rehabilitation, Reproductive Medicine

Link

https://academic.oup.com/humrep/article-pdf/38/Supplement_1/dead093.103/50786203/dead093.103.pdf

Cited by 3 articles.

1. ChatGPT’s Accuracy on Magnetic Resonance Imaging Basics: Characteristics and Limitations Depending on the Question Type;Diagnostics;2024-01-12

2. ChatGPT: a reliable fertility decision-making tool?;Human Reproduction;2024-01-10

3. A survey of consumer health question answering systems;AI Magazine;2023-11-27