Unraveling the landscape of large language models: a systematic review and future perspectives

Published: 2023-12-19
ISSN: 2754-4214
Container-title: Journal of Electronic Business & Digital Economics
Language: en
Short-container-title: JEBDE

Author:

Ding Qinxu, Ding Ding, Wang Yue, Guan Chong, Ding Bosheng

Abstract

Purpose: The rapid rise of large language models (LLMs) has propelled them to the forefront of applications in natural language processing (NLP). This paper aims to present a comprehensive examination of the research landscape in LLMs, providing an overview of the prevailing themes and topics within this dynamic domain.

Design/methodology/approach: Drawing from an extensive corpus of 198 records published between 1996 and 2023 from the relevant academic database, encompassing journal articles, books, book chapters, conference papers and selected working papers, this study delves deep into the multifaceted world of LLM research. The authors employed the BERTopic algorithm, a recent advancement in topic modeling, to conduct a comprehensive analysis of the data after it had been meticulously cleaned and preprocessed. BERTopic leverages the power of transformer-based language models like bidirectional encoder representations from transformers (BERT) to generate more meaningful and coherent topics. This approach facilitates the identification of hidden patterns within the data, enabling the authors to uncover valuable insights that might otherwise have remained obscure.

Findings: The analysis revealed four distinct clusters of topics in LLM research: “language and NLP”, “education and teaching”, “clinical and medical applications” and “speech and recognition techniques”. Each cluster embodies a unique aspect of LLM application and showcases the breadth of possibilities that LLM technology has to offer. In addition to presenting the research findings, this paper identifies key challenges and opportunities in the realm of LLMs. It underscores the necessity for further investigation in specific areas, including the paramount importance of addressing potential biases, transparency and explainability, data privacy and security, and responsible deployment of LLM technology.

Practical implications: This classification offers practical guidance for researchers, developers, educators, and policymakers to focus efforts and resources. The study underscores the importance of addressing challenges in LLMs, including potential biases, transparency, data privacy, and responsible deployment. Policymakers can utilize this information to shape regulations, while developers can tailor technology development based on the diverse applications identified. The findings also emphasize the need for interdisciplinary collaboration and highlight ethical considerations, providing a roadmap for navigating the complex landscape of LLM research and applications.

Originality/value: This study stands out as the first to examine the evolution of LLMs across such a long time frame and across such diversified disciplines. It provides a unique perspective on the key areas of LLM research, highlighting the breadth and depth of LLMs’ evolution.
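
For readers unfamiliar with the topic-modeling step named in the abstract, the following is a minimal sketch of the class-based TF-IDF (c-TF-IDF) weighting that BERTopic uses to label clusters. It is not the authors' pipeline: it assumes the documents have already been embedded and grouped into clusters, it uses raw term counts, and the toy documents, cluster labels, and the function names `c_tf_idf` and `top_terms` are hypothetical illustrations.

```python
import math
from collections import Counter


def c_tf_idf(clusters):
    """Score terms per cluster with a class-based TF-IDF weighting.

    clusters: dict mapping a cluster label to a list of document strings.
    Returns a dict mapping each label to a {term: score} dict.
    """
    # Treat every cluster as one long document and count its terms.
    term_counts = {
        label: Counter(word for doc in docs for word in doc.lower().split())
        for label, docs in clusters.items()
    }
    # Frequency of each term summed over all clusters.
    total_counts = Counter()
    for counts in term_counts.values():
        total_counts.update(counts)
    # Average number of words per cluster.
    avg_words = sum(total_counts.values()) / len(term_counts)

    # score(term, cluster) = tf(term, cluster) * log(1 + avg_words / tf(term))
    return {
        label: {
            term: tf * math.log(1 + avg_words / total_counts[term])
            for term, tf in counts.items()
        }
        for label, counts in term_counts.items()
    }


def top_terms(scores, n=3):
    """Return the n highest-scoring terms per cluster (ties broken alphabetically)."""
    return {
        label: [t for t, _ in sorted(s.items(), key=lambda kv: (-kv[1], kv[0]))[:n]]
        for label, s in scores.items()
    }


# Hypothetical, hand-made clusters standing in for the output of the
# embedding and clustering stages (not data from the paper).
toy_clusters = {
    "education": [
        "students use chatbots in teaching",
        "teaching students with language models",
    ],
    "clinical": [
        "clinical notes and medical records",
        "medical diagnosis from clinical text",
    ],
}

toy_scores = c_tf_idf(toy_clusters)
print(top_terms(toy_scores, n=2))
```

Terms that are frequent inside one cluster but rare across the whole corpus receive the highest scores, which is how a cluster ends up with a readable label such as “education and teaching”.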

Publisher

Emerald

Cited by 4 articles.

1. Leveraging LLMs for Efficient Topic Reviews;Applied Sciences;2024-08-30

2. Language discrepancies in the performance of generative artificial intelligence models: an examination of infectious disease queries in English and Arabic;BMC Infectious Diseases;2024-08-08

3. Large-scale analysis of the medical discourse on rheumatoid arthritis: complementing a socio-anthropologic analysis;2024-07-03

4. WisCompanion: Integrating the Socratic Method with ChatGPT-Based AI for Enhanced Explainability in Emotional Support for Older Adults;Lecture Notes in Computer Science;2024