Affiliation:
1. Laboratory for Atomistic and Molecular Mechanics (LAMM), Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139
2. Center for Computational Science and Engineering, Schwarzman College of Computing, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139
Abstract
For centuries, researchers have sought out ways to connect disparate areas of knowledge. While early scholars (Galileo, da Vinci, etc.) were experts across fields, specialization took hold later. With the advent of Artificial Intelligence, we can now explore relationships across areas (e.g., mechanics-biology) or disparate domains (e.g., failure mechanics-art). To achieve this, we use a fine-tuned large language model (LLM), here for a subset of knowledge in multiscale materials failure. The approach includes the use of a general-purpose LLM to distill question-answer pairs from raw sources followed by LLM fine-tuning. The resulting MechGPT LLM foundation model is used in a series of computational experiments to explore its capacity for knowledge retrieval, various language tasks, hypothesis generation, and connecting knowledge across disparate areas. While the model has some ability to recall knowledge from training, we find that LLMs are particularly useful for extracting structural insights through Ontological Knowledge Graphs. These interpretable graph structures provide explanatory insights, frameworks for new research questions, and visual representations of knowledge that also can be used in retrieval-augmented generation. Three versions of MechGPT are discussed, featuring different sizes from 13 × 10⁹ to 70 × 10⁹ parameters, and reaching context lengths of more than 10,000 tokens. This provides ample capacity for sophisticated retrieval-augmented strategies, as well as agent-based modeling where multiple LLMs interact collaboratively and/or adversarially, the incorporation of new data from the literature or web searches, as well as multimodality.
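The Ontological Knowledge Graph idea in the abstract can be illustrated with a minimal sketch: concept-relation-concept triples (such as an LLM might distill from materials-failure texts) form a directed graph, and a shortest path between two distant concepts yields an explanatory chain that could seed a retrieval-augmented prompt. The triples below are hypothetical examples, not taken from the paper.

```python
from collections import deque

# Hypothetical (concept, relation, concept) triples, illustrative only.
triples = [
    ("crack propagation", "is governed by", "stress intensity factor"),
    ("stress intensity factor", "depends on", "material toughness"),
    ("material toughness", "is enhanced by", "hierarchical structure"),
    ("hierarchical structure", "appears in", "biological materials"),
    ("biological materials", "inspire", "bioinspired design"),
]

# Adjacency list: node -> list of (relation, neighbor) edges.
graph = {}
for head, rel, tail in triples:
    graph.setdefault(head, []).append((rel, tail))

def shortest_chain(start, goal):
    """Breadth-first search returning the node path from start to goal."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for _, nxt in graph.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no connection found

chain = shortest_chain("crack propagation", "bioinspired design")
print(" -> ".join(chain))
```

In a retrieval-augmented setting, such a chain (with its edge labels) would be serialized into the prompt as structured context, letting the model ground its answer in an explicit reasoning path rather than free recall.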
Cited by 21 articles.