AFSE: towards improving model generalization of deep graph learning of ligand bioactivities targeting GPCR proteins

Published:2022-03-25 Issue:3 Volume:23 Page:
ISSN:1467-5463
Container-title:Briefings in Bioinformatics
language:en

Author:

Yin Yueming¹, Hu Haifeng¹, Yang Zhen¹², Jiang Feihu¹, Huang Yihe¹, Wu Jiansheng³⁴

Affiliation:

1. School of Telecommunications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, China

2. National Engineering Research Center of Communications and Networking, Nanjing University of Posts and Telecommunications, Nanjing 210003, China

3. School of Geographic and Biologic Information, Nanjing University of Posts and Telecommunications, Nanjing 210023, China

4. Smart Health Big Data Analysis and Location Services Engineering Research Center of Jiangsu Province, Nanjing University of Posts and Telecommunications, Nanjing 210023, China

Abstract

Ligand molecules naturally form graph structures. Recently, many deep graph learning (DGL) methods have been proposed and used to model ligand bioactivities, which is critical for the virtual screening of drug hits from compound databases of interest. However, pharmacists often find that well-trained DGL models struggle to achieve satisfactory performance in real virtual screening of drug candidates. The main challenges are that the training datasets are small and biased, and that the activity cliff cases within them further degrade model performance. These challenges cause predictors to overfit the training data and generalize poorly in real virtual screening scenarios. To address this, we proposed a novel algorithm named adversarial feature subspace enhancement (AFSE). AFSE dynamically generates abundant representations in a new feature subspace via bi-directional adversarial learning, and then minimizes the maximum loss of molecular divergence and bioactivity to ensure local smoothness of model outputs and significantly enhance the generalization of DGL models in predicting ligand bioactivities. Benchmark tests were run on seven state-of-the-art open-source DGL models with the potential to model ligand bioactivities, and evaluated by multiple criteria. The results indicate that, on almost all of the 33 GPCR datasets and seven DGL models, AFSE greatly improved the enhancement factor (top-10%, 20% and 30%), the most important criterion in virtual screening of hits from compound databases, while maintaining superior performance on RMSE and $r^2$. The AFSE web server is freely available at http://noveldelta.com/AFSE for academic purposes.
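
The abstract reports the enhancement factor at the top 10%, 20% and 30% of a ranked list without defining it. As a rough illustration only, the sketch below uses the common enrichment-style definition: the hit rate among the top-ranked fraction divided by the hit rate over the whole set. This definition, the binary active/inactive labels, and the name `enhancement_factor` are assumptions made for this example and may differ from the paper's formulation.

```python
def enhancement_factor(is_active, scores, top_fraction):
    """Hit rate in the top-ranked fraction divided by the overall hit rate.

    Assumed definition (not taken from the paper):
        EF = (actives_in_top / n_top) / (total_actives / n)
    is_active: 0/1 labels (1 = active ligand), hypothetical input format.
    scores: predicted bioactivities, higher meaning more active.
    """
    n = len(scores)
    if n == 0 or len(is_active) != n:
        raise ValueError("is_active and scores must be non-empty and equal in length")
    total_actives = sum(is_active)
    if total_actives == 0:
        raise ValueError("at least one active compound is required")
    # Round to the nearest integer to avoid floating-point artifacts
    # (e.g. 10 * 0.3 is slightly above 3), and keep at least one compound.
    n_top = max(1, round(n * top_fraction))
    # Rank compound indices by predicted score, best first.
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    actives_in_top = sum(is_active[i] for i in ranked[:n_top])
    return (actives_in_top / n_top) / (total_actives / n)


# Toy data: 10 compounds, 2 of them active, already sorted by score.
labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
preds = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
for frac in (0.1, 0.2, 0.3):
    print(f"EF top-{int(frac * 100)}%: {enhancement_factor(labels, preds, frac):.2f}")
```

Under this assumed definition, a value above 1 means the model places actives in the top fraction more often than random ranking would, which is why such a metric is useful for screening large compound databases.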

Funder

National Natural Science Foundation of China

Graduate Research and Innovation Projects of Jiangsu Province

Natural Science Foundation of Jiangsu Province

Publisher

Oxford University Press (OUP)

Subject

Molecular Biology, Information Systems

Link

https://academic.oup.com/bib/article-pdf/23/3/bbac077/43745700/bbac077.pdf
