Will we ever be able to accurately predict solubility?

Published:2024-03-18 Issue:1 Volume:11
ISSN:2052-4463
Container-title:Scientific Data
language:en
Short-container-title:Sci Data

Author:

Llompart P., Minoletti C., Baybekov S., Horvath D., Marcou G., Varnek A.

Abstract

Accurate prediction of thermodynamic solubility by machine learning remains a challenge. Recent models often display good performance, but their reliability may be deceptive when they are used prospectively. This study investigates the origins of these discrepancies along three directions: a historical perspective, an analysis of the aqueous solubility dataverse, and data quality. We investigated over 20 years of published solubility datasets and models, highlighting overlooked datasets and the overlaps between popular sets. We benchmarked recently published models on a novel curated solubility dataset and report poor performance. We also propose a workflow to curate aqueous solubility data, aimed at producing models useful to bench chemists. Our results demonstrate that some state-of-the-art models are not ready for public usage because they lack a well-defined applicability domain and overlook historical data sources. We report the impact of factors influencing the utility of the models: interlaboratory standard deviation, ionic state of the solute, and data sources. The models obtained here and the quality-assessed datasets are publicly available.

Publisher

Springer Science and Business Media LLC

Link

https://www.nature.com/articles/s41597-024-03105-6.pdf

Reference: 133 articles.

1. Kennedy, T. Managing the drug discovery/development interface. Drug Discov. Today 2, 436–444 (1997).

2. Kola, I. & Landis, J. Can the pharmaceutical industry reduce attrition rates? Nat. Rev. Drug Discov. 3, 711–716 (2004).

3. Millard, J., Alvarez-Núñez, F. & Yalkowsky, S. Solubilization by cosolvents. Establishing useful constants for the log-linear model. Int. J. Pharm. 245, 153–166 (2002).

4. Jouyban, A. & Abolghassemi Fakhree, M. A. Solubility prediction methods for drug/drug like molecules. Recent Pat. Chem. Eng. 1, 220–231 (2008).

5. van de Waterbeemd, H. Improving compound quality through in vitro and in silico physicochemical profiling. Chem. Biodivers. 6, 1760–1766 (2009).

Cited by 3 articles.

1. Identification of novel human 15-lipoxygenase-2 (h15-LOX-2) inhibitors using a virtual screening approach;2024-07-21

2. Evaluation of Machine Learning Models for Aqueous Solubility Prediction in Drug Discovery;2024-06-11

3. Evaluation of Machine Learning Models for Aqueous Solubility Prediction in Drug Discovery;2024 7th International Conference on Artificial Intelligence and Big Data (ICAIBD);2024-05-24