Learning the Drug-Target Interaction Lexicon

Published: 2022-12-10

Author:

Singh Rohit, Sledzieski Samuel, Cowen Lenore, Berger Bonnie

Abstract

Sequence-based prediction of drug-target interactions has the potential to accelerate drug discovery by complementing experimental screens. Such computational prediction needs to be generalizable and scalable while remaining sensitive to subtle variations in the inputs. However, current computational techniques fail to simultaneously meet these goals, often sacrificing performance on one to achieve the others. We develop a deep learning model, ConPLex, successfully leveraging the advances in pre-trained protein language models (“PLex”) and employing a novel protein-anchored contrastive co-embedding (“Con”) to outperform state-of-the-art approaches. ConPLex achieves high accuracy, broad adaptivity to unseen data, and specificity against decoy compounds. It makes predictions of binding based on the distance between learned representations, enabling predictions at the scale of massive compound libraries and the human proteome. Furthermore, ConPLex is interpretable, which enables us to visualize the drug-target lexicon and use embeddings to characterize the function of human cell-surface proteins. We anticipate ConPLex will facilitate novel drug discovery by making highly sensitive and interpretable in-silico drug screening feasible at genome scale. ConPLex is available open-source at https://github.com/samsledje/ConPLex.

Significance Statement

In time and money, one of the most expensive steps of the drug discovery pipeline is the experimental screening of small molecules to see which will bind to a protein target of interest. Therefore, accurate high-throughput computational prediction of drug-target interactions would unlock significant value, guiding and prioritizing promising candidates for experimental screening. We introduce ConPLex, a machine learning method for predicting drug-target binding which achieves state-of-the-art accuracy on many types of targets by using a pre-trained protein language model. The approach co-locates the proteins and the potential drug molecules in a shared feature space while learning to contrast true drugs from similar non-binding “decoy” molecules. ConPLex is extremely fast, which allows it to rapidly shortlist candidates for deeper investigation.
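
The two ideas in the abstract, scoring binding by distance in a shared embedding space and contrasting true binders against decoys with the protein as anchor, can be illustrated with a small sketch. This is not the ConPLex implementation: the three-dimensional vectors, the function names, the use of cosine distance, and the margin value are all assumptions made for illustration, and a generic triplet margin loss stands in for whatever contrastive objective the authors use.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def cosine_distance(u, v):
    """Distance used for scoring: small means 'close', i.e. predicted to bind."""
    return 1.0 - cosine_similarity(u, v)

def triplet_margin_loss(anchor, positive, negative, margin=0.5):
    """Generic triplet loss with the protein as anchor (assumed form).

    The loss is zero once the true binder is closer to the protein than
    the decoy by at least `margin`; otherwise it is positive.
    """
    gap = cosine_distance(anchor, positive) - cosine_distance(anchor, negative)
    return max(0.0, gap + margin)

def rank_compounds(protein, compounds):
    """Rank candidate compounds by similarity to the protein embedding."""
    scored = [(name, cosine_similarity(protein, vec))
              for name, vec in compounds.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Toy embeddings (hypothetical values, not model outputs).
protein = [1.0, 0.0, 0.0]
compounds = {
    "binder": [0.9, 0.1, 0.0],
    "decoy": [0.0, 1.0, 0.0],
}

ranking = rank_compounds(protein, compounds)
loss = triplet_margin_loss(protein, compounds["binder"], compounds["decoy"])
```

Because scoring reduces to one similarity computation per protein-compound pair, embeddings can be computed once and reused, which is what makes screening a large compound library against many targets cheap in a design of this kind.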

Publisher

Cold Spring Harbor Laboratory
