Benchmarking compound activity prediction for real-world drug discovery applications

Published: 2024-06-04 Issue: 1 Volume: 7
ISSN: 2399-3669
Container-title: Communications Chemistry
Language: en
Short-container-title: Commun Chem

Author:

Tian Tingzhong, Li Shuya, Zhang Ziting, Chen Lin, Zou Ziheng, Zhao Dan, Zeng Jianyang

Abstract

Identifying active compounds for target proteins is fundamental in early drug discovery. Recently, data-driven computational methods have demonstrated promising potential in predicting compound activities. However, there lacks a well-designed benchmark to comprehensively evaluate these methods from a practical perspective. To fill this gap, we propose a Compound Activity benchmark for Real-world Applications (CARA). Through carefully distinguishing assay types, designing train-test splitting schemes and selecting evaluation metrics, CARA can consider the biased distribution of current real-world compound activity data and avoid overestimation of model performances. We observed that although current models can make successful predictions for certain proportions of assays, their performances varied across different assays. In addition, evaluation of several few-shot training strategies demonstrated different performances related to task types. Overall, we provide a high-quality dataset for developing and evaluating compound activity prediction models, and the analyses in this work may inspire better applications of data-driven models in drug discovery.

Funder

National Natural Science Foundation of China

Publisher

Springer Science and Business Media LLC

Link

https://www.nature.com/articles/s42004-024-01204-4.pdf

Reference: 76 articles.

1. Hughes, J. P., Rees, S., Kalindjian, S. B. & Philpott, K. L. Principles of early drug discovery. Br. J. Pharmacol. 162, 1239–1249 (2011).

2. Lim, S. et al. A review on compound-protein interaction prediction methods: data, format, representation and model. Comput. Struct. Biotechnol. J. 19, 1541–1556 (2021).

3. Kinch, M. S., Kraft, Z. & Schwartz, T. 2021 in review: FDA approvals of new medicines. Drug Discov. Today 27, 2057–2064 (2022).

4. Frye, L., Bhat, S., Akinsanya, K. & Abel, R. From computer-aided drug discovery to computer-driven drug discovery. Drug Discov. Today: Technol. 39, 111–117 (2021).

5. Brown, D. G. & Boström, J. Where do recent small molecule clinical development candidates come from? J. Med. Chem. 61, 9442–9468 (2018).