Affiliation:
1. Infinite Intelligence Pharma, Beijing 100083, China
2. Shanghai Key Laboratory of New Drug Design, State Key Laboratory of Bioreactor Engineering, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
3. College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
Abstract
A deep learning‐powered virtual screening (VS) approach combined with two free docking programs is proposed and evaluated for screening an ultra‐large compound library to obtain diverse potential active compounds rapidly and efficiently. We found it to be a practical and transferable strategy that significantly reduces computational cost.
Background: Molecular docking‐based virtual screening (VS) aims to select ligands with potential pharmacological activity from millions or even billions of molecules. This process can significantly reduce the number of compounds that need to be tested experimentally. However, many of the docked molecules have low affinity for the given protein target, so docking them wastes substantial computational resources.
Methods: We implemented a fast and practical molecular screening approach called DL‐DockVS (deep learning dock virtual screening), which uses deep learning models (regression and classification models) to learn the outcomes of pipelined docking programs step by step.
Results: We showed that this approach could successfully weed out compounds with poor docking scores while keeping compounds with potentially high docking scores against 10 DUD‐E protein targets. A self‐built dataset of about 1.9 million molecules was used to further verify DL‐DockVS, yielding good results in terms of recall rate, active‐compound enrichment factor, and runtime.
Conclusions: We comprehensively evaluated the practicality and effectiveness of DL‐DockVS against 10 protein targets. Because it improves runtime while maintaining the success rate, it is a useful and promising approach for screening ultra‐large compound libraries in the age of big data. A model that has been well trained for one specific target can also be conveniently reused to predict high docking‐score molecules in other chemical libraries without repeating the docking computation.
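The abstract reports recall rate and enrichment factor without stating how they are computed. As a minimal sketch, assuming the commonly used definitions (recall as the fraction of a reference set that survives the filter, enrichment factor as the active rate in the selected subset divided by the active rate in the whole library), the two metrics could be written as follows. The function names and the toy compound IDs are hypothetical and are not taken from DL‐DockVS.

```python
def recall(selected, positives):
    """Fraction of the reference set (e.g. top docking-score compounds)
    that is still present after filtering."""
    selected, positives = set(selected), set(positives)
    if not positives:
        raise ValueError("positives must be non-empty")
    return len(selected & positives) / len(positives)


def enrichment_factor(selected, actives, library_size):
    """Active rate among selected compounds divided by the active rate
    in the full library (assumed standard definition)."""
    selected, actives = set(selected), set(actives)
    if not selected or not actives or library_size <= 0:
        raise ValueError("selected, actives and library_size must be non-empty/positive")
    hits = len(selected & actives)
    # (hits / len(selected)) / (len(actives) / library_size), rearranged
    # so that only one division is performed.
    return (hits * library_size) / (len(selected) * len(actives))


# Toy example: a library of 1000 compound IDs with 10 known actives (IDs 0..9).
library_size = 1000
actives = set(range(10))
# Hypothetical filter output: 100 compounds, 8 of which are actives.
selected = set(range(8)) | set(range(100, 192))

print(recall(selected, actives))
print(enrichment_factor(selected, actives, library_size))
```

In this toy case the filter keeps 10% of the library and 8 of the 10 actives, so the selected subset is enriched in actives relative to random selection.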
Subject
Applied Mathematics; Computer Science Applications; Biochemistry, Genetics and Molecular Biology (miscellaneous); Modeling and Simulation