Affiliation:
1. Weill Cornell Medicine
2. New York-Presbyterian Hospital, Weill Cornell Medical Center
3. Weill Cornell Medical College
4. University of Pennsylvania
Abstract
Objective
While artificial intelligence (AI), particularly large language models (LLMs), offers significant potential for medicine, it raises critical concerns due to the possibility of generating factually incorrect information, leading to potential long-term risks and ethical issues. This review aims to provide a comprehensive overview of the faithfulness problem in existing research on AI in healthcare and medicine, with a focus on the analysis of the causes of unfaithful results, evaluation metrics, and mitigation methods.
Materials and Methods
Using PRISMA methodology, we sourced 5,061 records from five databases (PubMed, Scopus, IEEE Xplore, ACM Digital Library, Google Scholar) published between January 2018 and March 2023. We removed duplicates and screened records based on exclusion criteria.
Results
With the 40 remaining articles, we conducted a systematic review of recent developments aimed at optimizing and evaluating factuality across a variety of generative medical AI approaches. These include knowledge-grounded LLMs, text-to-text generation, multimodality-to-text generation, and automatic medical fact-checking tasks.
Discussion
Current research investigating the factuality problem in medical AI is in its early stages. There are significant challenges related to data resources, backbone models, mitigation methods, and evaluation metrics. Promising opportunities exist for novel faithful medical AI research involving the adaptation of LLMs and prompt engineering.
Conclusion
This comprehensive review highlights the need for further research to address the issues of reliability and factuality in medical AI, serving as both a reference and inspiration for future research into the safe, ethical use of AI in medicine and healthcare.
Publisher
Research Square Platform LLC