Detecting hallucinations in large language models using semantic entropy

Published: 2024-06-19
Journal: Nature
Volume: 630, Issue: 8017, Pages: 625-630
ISSN: 0028-0836
Language: en

Author:

Sebastian Farquhar, Jannik Kossen, Lorenz Kuhn, Yarin Gal

Abstract

Large language model (LLM) systems, such as ChatGPT [1] or Gemini [2], can show impressive reasoning and question-answering capabilities but often ‘hallucinate’ false outputs and unsubstantiated answers [3,4]. Answering unreliably or without the necessary information prevents adoption in diverse fields, with problems including fabrication of legal precedents [5] or untrue facts in news articles [6] and even posing a risk to human life in medical domains such as radiology [7]. Encouraging truthfulness through supervision or reinforcement has been only partially successful [8]. Researchers need a general method for detecting hallucinations in LLMs that works even with new and unseen questions to which humans might not know the answer. Here we develop new methods grounded in statistics, proposing entropy-based uncertainty estimators for LLMs to detect a subset of hallucinations (confabulations), which are arbitrary and incorrect generations. Our method addresses the fact that one idea can be expressed in many ways by computing uncertainty at the level of meaning rather than specific sequences of words. Our method works across datasets and tasks without a priori knowledge of the task, requires no task-specific data and robustly generalizes to new tasks not seen before. By detecting when a prompt is likely to produce a confabulation, our method helps users understand when they must take extra care with LLMs and opens up new possibilities for using LLMs that are otherwise prevented by their unreliability.
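
The idea of computing uncertainty over meanings rather than word sequences can be illustrated with a small sketch. This is not the authors' implementation: it assumes several answers have already been sampled for one prompt, estimates each meaning's probability by simple sample frequency, and relies on a hypothetical `same_meaning` callback to decide whether two answers express the same idea. The `toy_same_meaning` function below is only a stand-in based on word matching; a real system would need a much stronger equivalence check, for example a language-inference model.

```python
import math
import string
from typing import Callable, List

# Hypothetical type for the equivalence check: True if two answers mean the same thing.
SameMeaning = Callable[[str, str], bool]


def cluster_by_meaning(answers: List[str], same_meaning: SameMeaning) -> List[List[str]]:
    """Greedily group answers; each answer joins the first cluster whose
    representative (first member) it is judged equivalent to."""
    clusters: List[List[str]] = []
    for answer in answers:
        for cluster in clusters:
            if same_meaning(cluster[0], answer):
                cluster.append(answer)
                break
        else:
            clusters.append([answer])
    return clusters


def semantic_entropy(answers: List[str], same_meaning: SameMeaning) -> float:
    """Entropy (in nats) over meaning clusters, with cluster probabilities
    estimated as the fraction of sampled answers falling in each cluster."""
    if not answers:
        raise ValueError("need at least one sampled answer")
    clusters = cluster_by_meaning(answers, same_meaning)
    total = len(answers)
    return sum((len(c) / total) * math.log(total / len(c)) for c in clusters)


# Assumption for illustration only: filler words ignored by the toy check.
FILLER_WORDS = {"it", "is", "the", "in", "a"}


def toy_same_meaning(a: str, b: str) -> bool:
    """Toy stand-in: answers match if their non-filler words are identical."""
    def content_words(text: str) -> tuple:
        cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
        return tuple(w for w in cleaned.split() if w not in FILLER_WORDS)
    return content_words(a) == content_words(b)


if __name__ == "__main__":
    consistent = ["Paris", "It is Paris.", "paris"]   # one meaning, three wordings
    scattered = ["Paris", "Rome", "Madrid"]           # three different meanings
    print(semantic_entropy(consistent, toy_same_meaning))
    print(semantic_entropy(scattered, toy_same_meaning))
```

In this sketch the three differently worded "Paris" answers collapse into one cluster and give zero entropy, whereas comparing the same answers as exact strings would treat them as three distinct outcomes. High entropy over meanings, as in the second list, is the signal that the prompt may be producing arbitrary answers.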

Publisher

Springer Science and Business Media LLC

Link

https://www.nature.com/articles/s41586-024-07421-0.pdf

References (65 articles)

1. OpenAI. GPT-4 technical report. Preprint at https://arxiv.org/abs/2303.08774 (2023).

2. Gemini Team. Gemini: a family of highly capable multimodal models. Preprint at https://arxiv.org/abs/2312.11805 (2023).

3. Xiao, Y. & Wang, W. Y. On hallucination and predictive uncertainty in conditional language generation. In Proc. 16th Conference of the European Chapter of the Association for Computational Linguistics 2734–2744 (Association for Computational Linguistics, 2021).

4. Rohrbach, A., Hendricks, L. A., Burns, K., Darrell, T. & Saenko, K. Object hallucination in image captioning. In Proc. 2018 Conference on Empirical Methods in Natural Language Processing (eds Riloff, E., Chiang, D., Hockenmaier, J. & Tsujii, J.) 4035–4045 (Association for Computational Linguistics, 2018).

5. Weiser, B. Lawyer who used ChatGPT faces penalty for made up citations. The New York Times (8 Jun 2023).

Cited by 8 articles.

1. ERG-AI: enhancing occupational ergonomics with uncertainty-aware ML and LLM feedback. Applied Intelligence (2024-09-10).

2. A scoping review of large language model based approaches for information extraction from radiology reports. npj Digital Medicine (2024-08-24).

3. Large language models in biomedicine and health: current research landscape and future directions. Journal of the American Medical Informatics Association (2024-08-22).

4. BioImage.IO Chatbot: a community-driven AI assistant for integrative computational bioimaging. Nature Methods (2024-08).

5. Large language models for overcoming language barriers in obstetric anaesthesia: a structured assessment. International Journal of Obstetric Anesthesia (2024-08).