Abstract
Background:
Artificial intelligence (AI)–based text generators, such as ChatGPT (OpenAI) and Google Bard (now Google Gemini), have demonstrated proficiency in predicting words and providing responses to various questions. However, their performance in answering clinical queries has not been well evaluated. This comparative analysis aimed to assess the capabilities of ChatGPT and Google Gemini in addressing clinical questions.
Method:
Separate interactions with ChatGPT and Google Gemini were conducted to obtain responses to a clinical question posed in PICOT (patient, intervention, comparison, outcome, time) format. To ascertain the accuracy of the information provided by the AI chatbots, a thorough examination of full-text articles was conducted.
Results:
Although ChatGPT exhibited relative accuracy in generating bibliographic information, it displayed some inconsistencies in clinical content. Conversely, Google Gemini generated citations and summaries that were entirely fabricated.
Conclusion:
Despite generating responses that may appear credible, both AI-based tools exhibited factual inaccuracies, raising substantial concerns about their reliability as potential sources of clinical information.
[J Nurs Educ. 2024;63(8):556–559.]