Abstract
Motivation
Machine learning-based scoring functions (MLBSFs) have been found to exhibit inconsistent performance across different benchmarks and to be prone to learning dataset bias. For the field to develop MLBSFs that learn a generalisable understanding of physics, a more rigorous understanding of how they perform is required.
Results
In this work, we compared the performance of a diverse set of popular MLBSFs (RFScore, SIGN, OnionNet-2, Pafnucy, and PointVS) to our proposed baseline models, which can only learn dataset biases, on a range of benchmarks. We found that these baseline models were competitive in accuracy with the MLBSFs on almost all proposed benchmarks, indicating that these models learn only dataset biases. Our tests and our provided platform, ToolBoxSF, will enable researchers to robustly interrogate MLBSF performance and determine the effect of dataset biases on their predictions.
Availability and Implementation
https://github.com/guydurant/toolboxsf
Contact
deane@stats.ox.ac.uk
Supplementary information
Supplementary data are available at Bioinformatics online.
Publisher
Cold Spring Harbor Laboratory