Affiliation:
1. Department of Physics, University of Science and Technology of China, Hefei, China
2. Department of Physics, City University of Hong Kong, Hong Kong, China
3. Hefei National Laboratory, University of Science and Technology of China, Hefei, China
Abstract
Predicting protein‐ligand binding affinity is a crucial and challenging task in structure‐based drug discovery. With the accumulation of complex structures and binding affinity data, various machine‐learning scoring functions, particularly those based on deep learning, have been developed for this task and have outperformed their traditional counterparts. This work proposes a fusion model that sequentially connects a graph neural network (GNN) and a convolutional neural network (CNN) to predict protein‐ligand binding affinity. In this model, the intermediate outputs of the GNN layers, which serve as supplementary descriptors of atomic chemical environments at different levels, are concatenated with the input features of the CNN. The model shows a noticeable improvement in performance on the CASF‐2016 benchmark compared to its constituent CNN models. The generalization ability of the model is evaluated by setting a series of thresholds for ligand extended‐connectivity fingerprint similarity or protein sequence similarity between the training and test sets. A masking experiment reveals that the model can capture key interaction regions. Furthermore, the fusion model is applied to a virtual screening task for a novel target, PI5P4Kα. The fusion strategy significantly improves the ability of the constituent CNN model to identify active compounds. This work offers a novel approach to enhancing the accuracy of deep learning models in predicting binding affinity through fusion strategies.
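The similarity-threshold evaluation mentioned in the abstract can be illustrated with a small sketch. This is not the authors' code; it only shows one plausible way a ligand-similarity cutoff might be applied. It assumes that fingerprints are represented as sets of "on" bit indices (in practice they would come from a cheminformatics toolkit), that similarity is measured with the Tanimoto coefficient, and that training ligands too similar to any test ligand are the ones removed. The function names and toy data are hypothetical.

```python
from typing import Dict, List, Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto similarity between two sets of 'on' fingerprint bits."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def filter_training_set(
    train: Dict[str, Set[int]],
    test: Dict[str, Set[int]],
    threshold: float,
) -> List[str]:
    """Keep training ligands whose maximum similarity to any test ligand
    is strictly below the threshold (assumed filtering direction)."""
    kept = []
    for name, fp in train.items():
        max_sim = max((tanimoto(fp, t) for t in test.values()), default=0.0)
        if max_sim < threshold:
            kept.append(name)
    return kept


# Hypothetical toy fingerprints, not real ECFP bit vectors.
train = {
    "lig_a": {1, 2, 3, 4},
    "lig_b": {1, 2, 9, 10},
    "lig_c": {20, 21, 22},
}
test = {"lig_t": {1, 2, 3, 5}}

# Sweep a series of thresholds; a lower threshold gives a stricter split.
for threshold in (1.0, 0.5, 0.3):
    print(threshold, filter_training_set(train, test, threshold))
```

Lowering the threshold shrinks the training set and makes the test ligands less familiar to the model, so performance as a function of the threshold indicates how well the model generalizes. The same pattern would apply to protein sequence similarity, with a sequence identity measure in place of the Tanimoto coefficient.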
Funder
National Natural Science Foundation of China
City University of Hong Kong