TCR-H: Machine Learning Prediction of T-cell Receptor Epitope Binding on Unseen Datasets

Published: 2023-11-29

Author:

T. Rajitha Rajeshwar, Demerdash Omar, Smith Jeremy C.

Abstract

AI/ML approaches to predicting T-cell receptor (TCR) epitope specificity achieve high performance metrics on test datasets that include sequences also present in the training set, but fail to generalize to test sets consisting of epitopes and TCRs that are absent from the training set, i.e., unseen. We present TCR-H, a supervised Support Vector Machine classification model that uses physicochemical features and is trained on the largest dataset available to date, with only experimentally validated non-binders as negative datapoints. TCR-H achieves an area under the receiver operating characteristic curve (ROC AUC) of 0.87 for epitope ‘hard splitting’ (i.e., on test sets in which all epitopes are unseen), 0.92 for TCR hard splitting, and 0.89 for ‘strict splitting’, in which neither the epitopes nor the TCRs in the test set are seen in the training data. TCR-H may thus represent a significant step towards general applicability of epitope:TCR specificity prediction.
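
The difference between 'hard' and 'strict' splitting can be made concrete with a minimal sketch. It is an illustration under stated assumptions, not the authors' pipeline: the function names, the `test_fraction` and `seed` parameters, and the placeholder identifiers (`TCR_A`, `EP_1`, ...) are hypothetical. Removing training rows that share a TCR with the test set is only one possible way to enforce the strict condition.

```python
import random

# Each record is (tcr, epitope, label); label 1 = binder, 0 = non-binder.
# The identifiers below are placeholders, not real sequences or real data.
pairs = [
    ("TCR_A", "EP_1", 1), ("TCR_B", "EP_1", 0),
    ("TCR_A", "EP_2", 1), ("TCR_C", "EP_2", 0),
    ("TCR_D", "EP_3", 1), ("TCR_B", "EP_3", 0),
    ("TCR_E", "EP_4", 1), ("TCR_C", "EP_4", 0),
    ("TCR_F", "EP_5", 1),
]


def epitope_hard_split(records, test_fraction=0.2, seed=0):
    """Hold out whole epitopes so that no test epitope occurs in training."""
    epitopes = sorted({epitope for _, epitope, _ in records})
    rng = random.Random(seed)
    rng.shuffle(epitopes)
    n_test = max(1, int(len(epitopes) * test_fraction))
    test_epitopes = set(epitopes[:n_test])
    train = [r for r in records if r[1] not in test_epitopes]
    test = [r for r in records if r[1] in test_epitopes]
    return train, test


def strict_split(records, test_fraction=0.2, seed=0):
    """Epitope hard split, then drop training rows whose TCR is in the test set.

    Assumption: discarding the overlapping training rows is an acceptable way
    to make both epitopes and TCRs unseen; other schemes are possible.
    """
    train, test = epitope_hard_split(records, test_fraction, seed)
    test_tcrs = {tcr for tcr, _, _ in test}
    train = [r for r in train if r[0] not in test_tcrs]
    return train, test


hard_train, hard_test = epitope_hard_split(pairs)
strict_train, strict_test = strict_split(pairs)
```

A TCR hard split follows the same pattern with the roles of the two columns swapped. Note that the strict variant discards some training rows, so the training set is smaller than under either hard split.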

Publisher

Cold Spring Harbor Laboratory
