Author:
Restrepo David, Wu Chenwei, Vásquez-Venegas Constanza, Matos João, Gallifant Jack, Celi Leo Anthony, Bitterman Danielle S., Nakayama Luis Filipe
Abstract
The deployment of large language models (LLMs) in healthcare has demonstrated substantial potential for enhancing clinical decision-making, administrative efficiency, and patient outcomes. However, the underrepresentation of diverse groups in the development and application of these models can perpetuate biases, leading to inequitable healthcare delivery. This paper presents a comprehensive scientometric analysis of LLM research for healthcare, including data from January 1, 2021, to July 1, 2024. By analyzing metadata from PubMed and Dimensions, including author affiliations, countries, and funding sources, we assess the diversity of contributors to LLM research. Our findings highlight significant gender and geographic disparities, with a predominance of male authors and contributions primarily from high-income countries (HICs). We introduce a novel journal diversity index based on Gini diversity to measure the inclusiveness of scientific publications. Our results underscore the necessity for greater representation in order to ensure the equitable application of LLMs in healthcare. We propose actionable strategies to enhance diversity and inclusivity in artificial intelligence research, with the ultimate goal of fostering a more inclusive and equitable future in healthcare innovation.
Publisher
Cold Spring Harbor Laboratory