Abstract
Generating T-cell receptors (TCRs) with desired epitope-binding properties is a fundamental step in the development of immunotherapies, yet it relies heavily on laborious and expensive wet-lab experiments. Recent advances in generative artificial intelligence have shown promise in protein design and engineering. In this regard, we propose a large language model, termed Epitope-Receptor-Transformer (ERTransformer), for the de novo generation of TCRs with a desired epitope-binding property. ERTransformer is built on EpitopeBERT and ReceptorBERT, which are trained on 1.9 million epitope sequences and 33.1 million TCR sequences, respectively. To demonstrate the model's capability, we generate 1000 TCRs for each of five epitopes with known natural TCRs. The artificial TCRs exhibit low sequence identity (average Bit-score 27.64 with a standard deviation of 1.50) but high biological function similarity (average BLOSUM62 score 32.32 with a standard deviation of 12.01) to natural TCRs. Furthermore, the artificial TCRs are structurally distinct from natural ones (average RMSD 2.84 Å with a standard deviation of 1.21 Å) yet exhibit comparable binding affinity towards the corresponding epitopes. Our work highlights the tremendous potential of applying ERTransformer to generate novel TCRs with desired epitope-binding ability.
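The abstract reports functional similarity as a BLOSUM62 score between artificial and natural TCR sequences. As an illustration only (the paper's exact scoring pipeline is not described here), the sketch below computes a position-wise BLOSUM62 score for two equal-length CDR3-like fragments using a tiny, hand-copied excerpt of the BLOSUM62 matrix; the function name and the toy sequences are hypothetical.

```python
# Minimal sketch: position-wise BLOSUM62 scoring of two aligned,
# equal-length peptide fragments. Only a small excerpt of the
# BLOSUM62 matrix is included here for illustration.
BLOSUM62_EXCERPT = {
    ("C", "C"): 9,
    ("A", "A"): 4,
    ("S", "S"): 4,
    ("T", "T"): 5,
    ("S", "T"): 1,  # conservative substitution scores positive
}

def blosum62_score(seq_a: str, seq_b: str) -> int:
    """Sum BLOSUM62 substitution scores over aligned positions."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    total = 0
    for a, b in zip(seq_a, seq_b):
        # The matrix is symmetric, so try both orderings of the pair.
        total += BLOSUM62_EXCERPT.get((a, b), BLOSUM62_EXCERPT.get((b, a), 0))
    return total

# Toy example: identical fragments score the diagonal sum;
# an S->T substitution loses only a few points.
print(blosum62_score("CASS", "CASS"))  # → 21
print(blosum62_score("CASS", "CAST"))  # → 18
```

In practice one would load the full matrix (e.g. via Biopython's `Bio.Align.substitution_matrices.load("BLOSUM62")`) and align sequences before scoring; higher scores indicate more biochemically similar, hence plausibly functionally similar, sequences even when raw sequence identity is low.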
Publisher
Cold Spring Harbor Laboratory