An evaluation of ChatGPT and Bard (Gemini) in the context of biological knowledge retrieval

Published: 2024-06-01 Issue: 6 Volume: 6
ISSN: 2516-8290
Container-title: Access Microbiology
Language: en

Author:

Caspi Ron¹, Karp Peter D.¹

Affiliation:

1. SRI International, Menlo Park, CA 94025, USA

Abstract

ChatGPT and Bard (now called Gemini), two conversational AI models developed by OpenAI and Google AI, respectively, have garnered considerable attention for their ability to engage in natural language conversations and perform various language-related tasks. While the versatility of these chatbots in generating text and simulating human-like conversations is undeniable, we wanted to evaluate their effectiveness in retrieving biological knowledge for curation and research purposes. To do so, we asked each chatbot a series of questions and scored their answers based on their quality. Out of a maximum score of 24, ChatGPT scored 5 and Bard scored 13. The issues encountered included missing information, incorrect answers, and responses that combined accurate and inaccurate details. Notably, both tools tend to fabricate references to scientific papers, undermining their usability. In light of these findings, we recommend that biologists continue to rely on traditional sources while periodically assessing the reliability of ChatGPT and Bard. As ChatGPT aptly suggested, for specific and up-to-date scientific information, established scientific journals, databases, and subject-matter experts remain the preferred avenues for trustworthy data.

Funder

SRI International

Publisher

Microbiology Society

Link

https://www.microbiologyresearch.org/content/journal/acmi/10.1099/acmi.0.000790.v3?crawler=true&mimetype=application/pdf
