Affiliation:
1. Department of Computing & Mathematics, Oral Roberts University, Tulsa, OK 74171, USA
Abstract
This study explores the application of fine-tuned large language models for predicting physicochemical properties, specifically focusing on Abraham model solute descriptors (E, S, A, B, V) and modified solvent parameters (e0, s0, a0, b0, v0). By leveraging ChemLLaMA, a specialized version of the LLaMA model for cheminformatics tasks, we developed the AbraLlama-Solvent and AbraLlama-Solute models using curated datasets of experimentally derived solute descriptors and solvent parameters. Our findings demonstrate that AbraLlama-Solvent and AbraLlama-Solute predict modified solvent parameters and solute descriptors with high accuracy, comparable to existing methods. The AbraLlama-Solvent model shows varying prediction accuracy across different solvents, influenced by their position within the chemical space, while the AbraLlama-Solute model consistently predicts solute descriptors with high accuracy. Both models are available as applications on Hugging Face, facilitating easy predictions from SMILES strings. This research highlights the potential of LLMs in chemistry applications, offering practical tools for solvent comparison and expanding the applicability of Abraham solvation equations to a broader range of organic solvents.
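As a rough illustration of how the two sets of predicted values fit together, the sketch below combines solute descriptors with modified solvent parameters in an Abraham-type linear relationship. It assumes the intercept-free form log P = e0·E + s0·S + a0·A + b0·B + v0·V, which the abstract does not state explicitly. The class names, the function name `log_partition`, and all numeric values are hypothetical placeholders, not data or code from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SoluteDescriptors:
    """Abraham solute descriptors (E, S, A, B, V)."""
    E: float
    S: float
    A: float
    B: float
    V: float


@dataclass(frozen=True)
class SolventParameters:
    """Modified Abraham solvent parameters (e0, s0, a0, b0, v0)."""
    e0: float
    s0: float
    a0: float
    b0: float
    v0: float


def log_partition(solute: SoluteDescriptors, solvent: SolventParameters) -> float:
    """Evaluate the assumed intercept-free Abraham-type equation.

    Assumption: log P = e0*E + s0*S + a0*A + b0*B + v0*V, with no constant term.
    """
    return (
        solvent.e0 * solute.E
        + solvent.s0 * solute.S
        + solvent.a0 * solute.A
        + solvent.b0 * solute.B
        + solvent.v0 * solute.V
    )


# Placeholder values chosen only to make the arithmetic easy to follow;
# they do not describe any real solute or solvent.
example_solute = SoluteDescriptors(E=0.5, S=0.5, A=0.25, B=0.5, V=1.0)
example_solvent = SolventParameters(e0=0.5, s0=-1.0, a0=0.0, b0=-4.0, v0=4.0)

if __name__ == "__main__":
    print(log_partition(example_solute, example_solvent))
```

In practice, the descriptor and parameter values would come from the two models' predictions for a given SMILES string rather than from hand-entered numbers.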