Sequence-based prediction of protein binding regions and drug–target interactions

Published:2022-02-08 Issue:1 Volume:14
ISSN:1758-2946
Container-title:Journal of Cheminformatics
language:en
Short-container-title:J Cheminform

Author:

Lee Ingoo, Nam Hojung

Abstract

Identifying drug–target interactions (DTIs) is important for drug discovery, but exhaustively searching the space of drug–target pairs is a major bottleneck. Many deep learning models have recently been proposed to address this problem. However, the developers of these models have neglected interpretability in model construction, even though it is closely related to a model’s performance. We hypothesized that training a model to predict important regions on a protein sequence would increase DTI prediction performance and provide a more interpretable model. We therefore constructed a deep learning model, named Highlights on Target Sequences (HoTS), which predicts binding regions (BRs) between a protein sequence and a drug ligand, as well as DTIs between them. To train the model, we collected protein–ligand complexes and the protein sequences of their binding sites, and pretrained the model to predict BRs for a given protein sequence–ligand pair via object detection employing transformers. After pretraining on BR prediction, we trained the model to predict DTIs from a compound token designed to assign attention to BRs. We confirmed that training the BR prediction model indeed improved DTI prediction performance. The HoTS model showed good performance in BR prediction on independent test datasets even though it does not use 3D structure information. Furthermore, it achieved the best performance in DTI prediction on test datasets. Additional analysis confirmed that attention was appropriately assigned to BRs and that transformers are important for both BR and DTI prediction. The source code is available on GitHub (https://github.com/GIST-CSBL/HoTS).
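Because the abstract frames BR prediction as object detection on a sequence, one natural way to compare a predicted region with an annotated one is a one-dimensional intersection over union (IoU). The sketch below is only an illustration under that assumption: the abstract does not state how HoTS scores regions, and the names `Region`, `region_iou`, and `best_match_iou`, as well as the half-open residue-index convention, are hypothetical and not taken from the HoTS code.

```python
from typing import List, Tuple

# Hypothetical representation: a region is (start, end) in residue indices,
# 0-based with an exclusive end. This convention is an assumption.
Region = Tuple[int, int]


def region_iou(pred: Region, true: Region) -> float:
    """Intersection over union of two half-open intervals on a sequence."""
    # Overlap length is zero when the intervals do not intersect.
    inter = max(0, min(pred[1], true[1]) - max(pred[0], true[0]))
    union = (pred[1] - pred[0]) + (true[1] - true[0]) - inter
    return inter / union if union > 0 else 0.0


def best_match_iou(preds: List[Region], trues: List[Region]) -> List[float]:
    """For each annotated region, the highest IoU reached by any prediction."""
    return [
        max((region_iou(p, t) for p in preds), default=0.0)
        for t in trues
    ]
```

For example, `region_iou((10, 20), (15, 25))` compares two 10-residue regions that share 5 residues, so the overlap is 5 and the union is 15. An annotated region with no overlapping prediction receives a score of 0.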

Funder

National Research Foundation of Korea

Gwangju Institute of Science and Technology

Publisher

Springer Science and Business Media LLC

Subject

Library and Information Sciences, Computer Graphics and Computer-Aided Design, Physical and Theoretical Chemistry, Computer Science Applications

Link

https://link.springer.com/content/pdf/10.1186/s13321-022-00584-w.pdf

References (69 articles)

1. Klebe G (2006) Virtual ligand screening: strategies, perspectives and limitations. Drug Discov Today 11(13–14):580–594. https://doi.org/10.1016/j.drudis.2006.05.012

2. Cheng T, Hao M, Takeda T, Bryant SH, Wang Y (2017) Large-scale prediction of drug–target interaction: a data-centric review. AAPS J 19(5):1264–1275. https://doi.org/10.1208/s12248-017-0092-6