Affiliation:
1. Unité de Biologie Fonctionnelle et Adaptative, Université Paris Cité, CNRS UMR8251, INSERM U1133, F‐75013 Paris, France
Abstract
Determining the target‐bound conformation of a drug‐like molecule is a crucial step in drug design, as it affects the outcome of virtual screening (VS) and paves the way for hit‐to‐lead and lead optimization. While most docking programs manage to produce at least one near‐native pose for a bioactive molecule inside its binding pocket, their integrated classical scoring functions (SFs) generally fail to prioritize this pose. Many studies have tackled this SF problem, offering multiple pose refinement and/or classification methods, albeit with limitations. This study presents a new support vector machine model for pose classification, called “ClassyPose”, which predicts the probability that a receptor‐bound ligand conformation is near‐native, without any additional pose optimization step. Trained on protein‐ligand extended connectivity features extracted from over 21 600 crystal and docking poses of diverse ligands, this model outperformed other machine‐learning algorithms and three existing SFs in terms of docking power, identifying the native ligand pose as the top‐ranked solution for more than 90% of entries in two test sets. It also achieved high specificity (above 0.96) and improved VS performance when used for pose selection. This efficient, user‐friendly tool and all related data are available at https://github.com/vktrannguyen/Classy_Pose.
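
To make the two headline metrics concrete, the following minimal Python sketch shows one way docking power (the fraction of entries whose top‐ranked pose is near‐native) and specificity (the fraction of non‐native poses correctly rejected) could be computed from per‐pose probabilities. It is an illustration only, not the authors' evaluation code: the function names, the 0.5 decision threshold, and the toy data are all assumptions introduced here.

```python
from collections import defaultdict


def docking_power(poses):
    """Fraction of entries whose highest-probability pose is near-native.

    `poses` is a list of (entry_id, probability, is_near_native) tuples;
    this layout is a hypothetical one chosen for the illustration.
    """
    by_entry = defaultdict(list)
    for entry, prob, is_near_native in poses:
        by_entry[entry].append((prob, is_near_native))
    # An entry counts as a success when its top-ranked pose is near-native.
    hits = sum(
        1 for scored in by_entry.values()
        if max(scored, key=lambda p: p[0])[1]
    )
    return hits / len(by_entry)


def specificity(poses, threshold=0.5):
    """True negatives divided by all actual negatives (non-native poses).

    The 0.5 threshold is an assumed default, not a value from the study.
    """
    negatives = [prob for _, prob, near in poses if not near]
    true_negatives = sum(1 for prob in negatives if prob < threshold)
    return true_negatives / len(negatives)


# Toy data: two entries, each with one near-native pose and two decoys.
poses = [
    ("entry_A", 0.91, True),
    ("entry_A", 0.12, False),
    ("entry_A", 0.40, False),
    ("entry_B", 0.35, True),
    ("entry_B", 0.77, False),
    ("entry_B", 0.08, False),
]

print(docking_power(poses))  # entry_A succeeds, entry_B does not
print(specificity(poses))    # three of four decoys fall below the threshold
```

In this toy case the near‐native pose is ranked first for only one of the two entries, and one decoy is scored above the threshold, so both metrics fall well below the values reported for the model.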