Predicting monoclonal antibody binding sequences from a sparse sampling of all possible sequences

Published:2024-08-12 Issue:1 Volume:7
ISSN:2399-3642
Container-title:Communications Biology
language:en
Short-container-title:Commun Biol

Author:

Bisarad Pritha, Kelbauskas Laimonas, Singh Akanksha, Taguchi Alexander T., Trenchevska Olgica, Woodbury Neal W.

Abstract

Previous work has shown that binding of target proteins to a sparse, unbiased sample of all possible peptide sequences is sufficient to train a machine learning model that can then predict, with statistically high accuracy, target binding to any possible peptide sequence of similar length. Here, highly sequence-specific molecular recognition is explored by measuring the binding of 8 monoclonal antibodies (mAbs) with specific linear cognate epitopes to an array containing 121,715 near-random sequences about 10 residues in length. Network models trained on the resulting sequence-binding values are used to predict the binding of each mAb to its cognate sequence and to one million random sequences generated in silico. The model always ranks the cognate sequence among the top 100 sequences, and for 6 of the 8 mAbs the cognate sequence ranks in the top ten. Practically, this approach has potential utility in selecting highly specific mAbs for therapeutics or diagnostics. More fundamentally, it demonstrates that very sparse random sampling of a large amino acid sequence space is sufficient to generate comprehensive models predictive of highly specific molecular recognition.
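The ranking test described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' method: a ridge regression on one-hot encoded residues stands in for the paper's network models, the binding values are synthetic (the count of positions matching a hypothetical cognate sequence), the alphabet is taken to be the 20 standard amino acids, and all function names are invented. Under those assumptions, 121,715 sequences of length exactly 10 would cover a fraction on the order of 10^-8 of sequence space.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumption: 20 standard residues
HYPOTHETICAL_COGNATE = "AKLDEFGHWY"   # made-up epitope, not from the paper


def sampled_fraction(n_sampled, length, alphabet_size=len(AMINO_ACIDS)):
    """Fraction of all sequences of a fixed length covered by n_sampled of them."""
    return n_sampled / alphabet_size ** length


def encode(seq):
    """Map a peptide string to an array of residue indices."""
    return np.array([AMINO_ACIDS.index(c) for c in seq])


def one_hot(indices):
    """Turn (n, length) residue indices into (n, length * 20) one-hot rows."""
    return np.eye(len(AMINO_ACIDS))[indices].reshape(len(indices), -1)


def rank_of(target_score, background_scores):
    """1-based rank of the target among the background (1 = highest score)."""
    return 1 + int(np.sum(background_scores > target_score))


def toy_cognate_rank(n_train=5000, n_background=10000, seed=0):
    """Train on a sparse random sample, then rank the cognate among random sequences."""
    rng = np.random.default_rng(seed)
    cognate = encode(HYPOTHETICAL_COGNATE)
    length = len(cognate)

    # Synthetic "array": random peptides with a toy binding value equal to
    # the number of positions that match the hypothetical cognate.
    train = rng.integers(0, len(AMINO_ACIDS), size=(n_train, length))
    binding = (train == cognate).sum(axis=1).astype(float)

    # Ridge regression via the normal equations (stand-in for a neural network).
    X = one_hot(train)
    ridge = 1e-3 * np.eye(X.shape[1])
    weights = np.linalg.solve(X.T @ X + ridge, X.T @ binding)

    # Score the cognate and a fresh set of random sequences, then rank.
    background = rng.integers(0, len(AMINO_ACIDS), size=(n_background, length))
    cognate_score = (one_hot(cognate[None, :]) @ weights)[0]
    background_scores = one_hot(background) @ weights
    return rank_of(cognate_score, background_scores)


if __name__ == "__main__":
    print("sampled fraction:", sampled_fraction(121_715, 10))
    print("rank of cognate:", toy_cognate_rank())
```

With measured data, the training sequences and binding values would come from the peptide array rather than from the toy rule, and the background set would be scaled up to the one million in silico sequences used in the paper.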

Funder

Arizona State University

Publisher

Springer Science and Business Media LLC

Link

https://www.nature.com/articles/s42003-024-06650-3.pdf

References (36 articles)

1. Braghetto, A., Orlandini, E. & Baiesi, M. Interpretable Machine Learning of Amino Acid Patterns in Proteins: A Statistical Ensemble Approach. J. Chem. Theory Comput. 19, 6011–6022 (2023).

2. ElAbd, H. et al. Amino acid encoding for deep learning applications. BMC Bioinforma. 21, 235 (2020).

3. Johnston, K. E. et al. Machine Learning for Protein Engineering. Preprint at https://arxiv.org/abs/2305.16634 (2023).

4. Xu, Y. et al. Deep Dive into Machine Learning Models for Protein Engineering. J. Chem. Inf. Model. 60, 2773–2790 (2020).

5. Legutki, J. B. et al. Scalable High-Density Peptide Arrays for Comprehensive Health Monitoring. Nat. Commun. 5, 4785 (2014).