NetTCR 2.2 - Improved TCR specificity predictions by combining pan- and peptide-specific training strategies, loss-scaling and integration of sequence similarity

Published: 2023-10-16

Author:

Jensen Mathias Fynbo, Nielsen Morten

Abstract

The ability to predict binding between peptides presented by Major Histocompatibility Complex (MHC) class I molecules and T-cell receptors (TCRs) is of great interest in the areas of vaccine development, cancer treatment and treatment of autoimmune diseases. However, the scarcity of paired-chain data, combined with the bias towards a few well-studied epitopes, has challenged the development of pan-specific machine-learning (ML) models with accurate predictive power towards peptides characterized by little or no TCR data. To deal with this, we here benefit from a larger paired-chain peptide-TCR dataset and explore different ML model architectures and training strategies to better deal with imbalanced data. We show that while simple changes to the architecture and training strategies result in greatly improved performance, particularly for peptides with little available data, predictions on unseen peptides remain challenging, especially for peptides distant to the training peptides. We also demonstrate that ML models can be used to detect potential outliers, and that the removal of such outliers from training further improves the overall performance. Furthermore, we show that a model combining the properties of pan-specific and peptide-specific models achieves improved performance, and that performance can be further improved by integrating similarity-based predictions, especially when a low false positive rate is desirable. Moreover, in the context of the IMMREP 2022 benchmark, this updated modeling framework achieved state-of-the-art performance. Finally, we show that combining all these approaches results in acceptable predictive accuracy for peptides characterized by as few as 15 positive TCRs. This observation thus places great promise on rapidly expanding the peptide coverage of the current models for predicting TCR specificity.
The final NetTCR 2.2 models are available at https://github.com/mnielLab/NetTCR-2.2, and as a web server at https://services.healthtech.dtu.dk/services/NetTCR-2.2/.
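
The abstract mentions loss-scaling as one way to handle data that is imbalanced across peptides, but it does not give the formula. As a rough illustration only, the Python sketch below assigns each training example a weight based on how rare its peptide is, so that the few well-studied epitopes dominate the loss less. The function name `peptide_loss_weights`, the log2 inverse-frequency form, and the toy peptide counts are all assumptions made for this example and are not taken from the NetTCR 2.2 code.

```python
import math
from collections import Counter


def peptide_loss_weights(peptides):
    """Return one loss weight per training example (hypothetical scheme).

    Examples from rare peptides receive larger weights than examples from
    abundant peptides. Weights are rescaled so that their mean is 1, which
    keeps the overall loss magnitude comparable to unweighted training.
    """
    if not peptides:
        return []
    counts = Counter(peptides)
    total = len(peptides)
    # Assumed form: log2 of the inverse peptide frequency, plus 1 so that
    # a peptide covering all of the data still gets a nonzero weight.
    raw = [math.log2(total / counts[p]) + 1.0 for p in peptides]
    mean_raw = sum(raw) / len(raw)
    return [w / mean_raw for w in raw]


# Toy data: one abundant peptide (6 examples) and one rare peptide (2 examples).
example = ["GILGFVFTL"] * 6 + ["NLVPMVATV"] * 2
weights = peptide_loss_weights(example)
```

Weights of this kind could then be passed to a training routine that accepts per-example weights. The logarithm is one possible design choice: it dampens the correction so that very rare peptides are not weighted excessively.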

Publisher

Cold Spring Harbor Laboratory

References (33 articles)

1. T-cell antigen receptor genes and T-cell recognition

2. Immunoinformatics: Predicting Peptide–MHC Binding

3. Can we predict T cell specificity with digital biology and machine learning?

4. Current challenges for unseen-epitope TCR interaction prediction and a new perspective derived from image classification

5. A framework for highly multiplexed dextramer mapping and prediction of T cell receptor sequences to antigen specificity

Cited by 2 articles.

1. Development and use of machine learning algorithms in vaccine target selection; npj Vaccines; 2024-01-20

2. TSpred: a robust prediction framework for TCR-epitope interactions based on an ensemble deep learning approach using paired chain TCR sequence data; 2023-12-06