Author:
Feng Bin, Liu Zequn, Huang Nanlan, Xiao Zhiping, Zhang Haomiao, Mirzoyan Srbuhi, Xu Hanwen, Hao Jiaran, Xu Yinghui, Zhang Ming, Wang Sheng
Abstract
Compound bioactivity plays an important role in different stages of drug development and discovery. Existing machine learning approaches generalize poorly in compound bioactivity prediction because each assay contains only a small number of compounds and measurements are incompatible across assays. Here, we propose ActFound, a foundation model for bioactivity prediction trained on 2.3 million experimentally measured bioactivity compounds and 50,869 assays from ChEMBL and BindingDB. The key idea of ActFound is to employ pairwise learning to learn the relative value differences between two compounds within the same assay, which circumvents the incompatibility among assays. ActFound further exploits meta-learning to jointly optimize the model across all assays. On six real-world bioactivity datasets, ActFound demonstrates accurate in-domain prediction and strong generalization across datasets, assay types, and molecular scaffolds. We also demonstrate that ActFound can serve as an accurate alternative to the leading computational chemistry software FEP+ (OPLS4), achieving comparable performance when fine-tuned on only a few data points. These promising results indicate that ActFound can be an effective foundation model for a wide range of tasks in compound bioactivity prediction, paving the path for machine learning-based drug development and discovery.
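The pairwise idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `pairwise_differences`, the toy feature vectors, and the activity values are all hypothetical, and the incompatibility between assays is assumed here to take the form of a constant additive offset (for example, a unit change on a log scale). Under that assumption, the difference between two compounds in the same assay does not depend on the offset.

```python
import numpy as np

def pairwise_differences(features, activities):
    """Build (x_i, x_j) inputs and y_i - y_j targets for one assay.

    Hypothetical helper for illustration only; ActFound's actual
    pair construction is not described in the abstract.
    """
    features = np.asarray(features, dtype=float)
    activities = np.asarray(activities, dtype=float)
    # All unordered compound pairs (i < j) within the assay.
    i_idx, j_idx = np.triu_indices(len(activities), k=1)
    pair_inputs = np.concatenate([features[i_idx], features[j_idx]], axis=1)
    pair_targets = activities[i_idx] - activities[j_idx]
    return pair_inputs, pair_targets

# Toy data: three compounds with made-up 2-dimensional features.
features = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
assay_a = np.array([5.0, 6.0, 7.5])
# Assumption: a second assay reports the same compounds shifted by a constant.
assay_b = assay_a + 3.0

pair_inputs, targets_a = pairwise_differences(features, assay_a)
_, targets_b = pairwise_differences(features, assay_b)

# The offset cancels in the differences, so both assays yield the same targets.
print(np.allclose(targets_a, targets_b))
```

A model trained on such pair targets never sees absolute activity values, which is why pairs from assays with different baselines can be pooled in this simplified setting.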
Publisher
Cold Spring Harbor Laboratory