Replicating a High-Impact Scientific Publication Using Systems of Large Language Models

Published: 2024-04-12

Author:

Bersenev Dennis, Yachie-Kinoshita Ayako, Palaniappan Sucheendra K.

Abstract

Publications focused on scientific discoveries derived from analyzing large biological datasets typically follow a cycle of hypothesis generation, experimentation, and data interpretation. Reproducing the findings of such papers is crucial for confirming the validity of the scientific, statistical, and computational methods employed in the study, and it also lays the foundation for new research. By employing a multi-agent system composed of Large Language Models (LLMs), including both text and code generation agents built on OpenAI’s platform, our study attempts to reproduce the methodology and findings of a high-impact publication that investigated the expression of viral-entry-associated genes using single-cell RNA sequencing (scRNA-seq). The LLM system was critically evaluated against the analysis results from the original study, highlighting the system’s ability to perform simple statistical analysis tasks and to conduct literature reviews that establish the purpose of the analyses. However, we also identified significant challenges in the system, such as nondeterminism in code generation, difficulties in data procurement, the limitations imposed by context length, and bias from the model’s training data. By addressing these challenges and expanding the system’s capabilities, we intend to contribute to the goal of automating scientific research for efficiency, reproducibility, and transparency, and to drive the discussion on the role of AI in scientific discovery.

Publisher

Cold Spring Harbor Laboratory

References (28 articles)

1. Bommasani, R., et al.: On the Opportunities and Risks of Foundation Models. https://arxiv.org/abs/2108.07258 (2022)

2. Fu, C., et al.: MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models. https://arxiv.org/abs/2306.13394 (2023)

3. Rombach, R., et al.: High-Resolution Image Synthesis with Latent Diffusion Models. https://arxiv.org/abs/2112.10752 (2022)

4. Kramer, S., et al.: Automated Scientific Discovery: From Equation Discovery to Autonomous Discovery Systems. https://arxiv.org/abs/2305.02251 (2023)

5. Langley, P.W., Simon, H.A., Bradshaw, G., Zytkow, J.M.: Scientific Discovery: Computational Explorations of the Creative Process. MIT Press (1987)