Affiliation:
1. University of Southern California, Los Angeles, California, USA
Abstract
The recently released BARD and ChatGPT have generated substantial interest among researchers and institutions concerned about their impact on education, medicine, law, and more. This paper uses questions from the Watson Jeopardy! Challenge to compare BARD, ChatGPT, and Watson. Using those Jeopardy! questions, we find that on the questions where Watson had high confidence, all three systems perform with similar accuracy. We also find that both BARD and ChatGPT perform with the accuracy of a human expert, and that their sets of correct answers are highly similar as measured by a Tanimoto similarity score. However, we also find that both systems can change their solutions to the same input on subsequent uses: when given the same Jeopardy! category and question multiple times, both BARD and ChatGPT can generate different and conflicting answers. Accordingly, the paper examines the characteristics of some of the questions that elicit different answers to the same inputs. Finally, the paper discusses some of the implications of these differing answers and the impact of this lack of reproducibility on testing such systems.
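The Tanimoto score mentioned above, for sets, is the size of the intersection divided by the size of the union. A minimal sketch of how two systems' correct-answer sets could be compared this way, using hypothetical question IDs (the actual question data and scores are in the paper itself):

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity of two sets: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty sets are conventionally identical
    return len(a & b) / len(a | b)

# Hypothetical example: question IDs each system answered correctly
bard_correct = {"Q1", "Q2", "Q3", "Q5"}
chatgpt_correct = {"Q1", "Q2", "Q4", "Q5"}

print(tanimoto(bard_correct, chatgpt_correct))  # 3 shared / 5 total = 0.6
```

A score near 1.0 would indicate the two systems answer largely the same questions correctly; a score near 0.0 would indicate their correct answers barely overlap.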
Cited by 9 articles.