Affiliation:
1. Applied Biotechnology Research Center, Baqiyatallah University of Medical Sciences, Tehran, Iran
Abstract
When designing a machine learning‐based scoring function, we have access to only a limited number of protein‐ligand complexes with experimentally determined binding affinity values, which represent a small fraction of all possible protein‐ligand complexes. Consequently, it is crucial to report a measure of confidence and to quantify the uncertainty in the model's predictions at test time. Here, we adopt the conformal prediction technique to evaluate the confidence of the prediction for each member of the core set of the CASF 2016 benchmark. The conformal prediction technique requires a diverse ensemble of predictors for uncertainty estimation. To this end, we introduce ENS‐Score, an ensemble predictor that comprises 30 models with different protein‐ligand representation approaches and achieves a Pearson's correlation of 0.842 on the core set of the CASF 2016 benchmark. We also comprehensively investigate the residual error of each data point, assessing whether the residual errors are normally distributed and how they correlate with structural features of the ligands, such as hydrophobic interactions and halogen bonding. Finally, we provide a locally hosted web application to facilitate the use of ENS‐Score. All code needed to reproduce the results is available at https://github.com/miladrayka/ENS_Score.
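To make the idea of ensemble-based conformal prediction concrete, the following is a minimal sketch of one common variant (normalized split conformal regression), not necessarily the procedure used for ENS‐Score. It assumes the ensemble mean serves as the point prediction and the ensemble standard deviation as a per-complex difficulty estimate; the function names, the `alpha` and `eps` parameters, and the toy numbers are all hypothetical and do not come from the paper or its repository.

```python
import math
import statistics


def ensemble_summary(member_predictions):
    """Mean and population standard deviation of the ensemble members'
    predicted affinities for a single complex."""
    mean = statistics.fmean(member_predictions)
    std = statistics.pstdev(member_predictions)
    return mean, std


def calibrate(cal_member_predictions, cal_targets, alpha=0.1, eps=1e-6):
    """Conformal quantile of normalized absolute residuals on a held-out
    calibration set (assumption: calibration and test data are exchangeable)."""
    scores = []
    for preds, y in zip(cal_member_predictions, cal_targets):
        mean, std = ensemble_summary(preds)
        # Nonconformity score: residual scaled by ensemble disagreement.
        scores.append(abs(y - mean) / (std + eps))
    scores.sort()
    n = len(scores)
    # Finite-sample corrected rank for coverage of at least 1 - alpha.
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for the requested confidence level.
        return math.inf
    return scores[rank - 1]


def predict_interval(member_predictions, q, eps=1e-6):
    """Return (lower, point prediction, upper) for one test complex."""
    mean, std = ensemble_summary(member_predictions)
    half_width = q * (std + eps)
    return mean - half_width, mean, mean + half_width


# Toy, made-up numbers: three calibration complexes, two ensemble members each.
cal_preds = [[5.0, 6.0], [7.0, 7.5], [4.0, 4.2]]
cal_true = [6.0, 7.0, 4.5]
q = calibrate(cal_preds, cal_true, alpha=0.5)
lower, point, upper = predict_interval([6.0, 8.0], q)
```

In this sketch, complexes on which the ensemble members disagree more receive wider intervals, which is why a diverse ensemble matters for the uncertainty estimate.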
Cited by
2 articles.