Evaluation of Machine Learning Models for Aqueous Solubility Prediction in Drug Discovery

Published: 2024-06-11

Author:

Xue Nian, Zhang Yuzhu, Liu Sensen

Abstract

Determining the aqueous solubility of a chemical compound is of great importance in in silico drug discovery. However, predicting aqueous solubility both accurately and rapidly remains a challenging task. This paper evaluates how well multiple machine learning models predict the aqueous solubility of compounds. Specifically, we apply a series of machine learning algorithms, including Random Forest, XGBoost, LightGBM, and CatBoost, to a well-established aqueous solubility dataset (i.e., the Huuskonen dataset) of over 1200 compounds. Experimental results show that even traditional machine learning algorithms can achieve satisfactory performance with high accuracy. In addition, our investigation goes beyond prediction accuracy and examines the interpretability of the models, identifying key features and the molecular properties that influence the predicted outcomes. This study sheds light on the use of machine learning approaches to predict compound solubility, which can significantly shorten the time that researchers spend on new drug discovery.
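The abstract does not say which accuracy metrics or which interpretability method the authors used. As a rough illustration only, the sketch below assumes two common regression metrics (RMSE and R²) and permutation importance as one possible way to rank features. The data are synthetic stand-ins, not the Huuskonen dataset, and `toy_model` is a hypothetical placeholder for a trained regressor such as a Random Forest.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root-mean-square error between measured and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean increase in RMSE when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = rmse(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle only column j, leaving every other feature untouched.
            column = [row[j] for row in X]
            rng.shuffle(column)
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(rmse(y, [predict(row) for row in shuffled]) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Synthetic stand-in data (NOT the Huuskonen dataset): two made-up descriptors
# per compound; the target depends only on the first one, plus noise.
data_rng = random.Random(42)
X = [[data_rng.uniform(-2, 6), data_rng.uniform(0, 1)] for _ in range(200)]
y = [0.5 - 1.0 * row[0] + data_rng.gauss(0, 0.3) for row in X]


def toy_model(row):
    """Hypothetical fitted model; a real study would use a trained regressor."""
    return 0.5 - 1.0 * row[0]


y_pred = [toy_model(row) for row in X]
print("RMSE:", round(rmse(y, y_pred), 3))
print("R2:", round(r2_score(y, y_pred), 3))
print("Permutation importances:", permutation_importance(toy_model, X, y))
```

In this toy setup the second descriptor is never used by the model, so shuffling it leaves the error unchanged and its importance is zero, while shuffling the first descriptor degrades the predictions substantially. With a real model, the same procedure would be run on held-out compounds.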

Publisher

Cold Spring Harbor Laboratory

References: 25 articles.

1. Evaluation of deep learning architectures for aqueous solubility prediction; ACS Omega, 2022

2. When poor solubility becomes an issue: From early stage to proof of concept

3. M. Mahapatra and M. Karuppasamy, “Fundamental considerations in drug design,” Computer Aided Drug Design (CADD): From Ligand-Based Methods to Structure-Based Approaches, 2022.

4. High throughput solubility measurement in drug discovery and development

5. D. Letinski, A. Redman, H. Birch, et al., “Inter-laboratory comparison of water solubility methods applied to difficult-to-test substances,” BMC Chemistry, 2021.