An Interpretable Classification Model Using Gluten-Specific TCR Sequences Shows Diagnostic Potential in Coeliac Disease
Published: 2023-11-25
Issue: 12
Volume: 13
Page: 1707
ISSN: 2218-273X
Container-title: Biomolecules
language: en
Short-container-title: Biomolecules
Author:
Fowler Anna (1), FitzPatrick Michael (2), Shanmugarasa Aberami (3), Ibrahim Amro Sayed Fadel (4), Kockelbergh Hannah (1), Yang Han-Chieh (4), Williams-Walker Amelia (4), Luu Hoang Kim Ngan (4), Evans Shelley (4), Provine Nicholas (2), Klenerman Paul (2, 5), Soilleux Elizabeth J. (4)
Affiliation:
1. Department of Health Data Science, Institute of Population Health, University of Liverpool, Liverpool L69 3GF, UK
2. Translational Gastroenterology Unit, Nuffield Department of Medicine, University of Oxford, Oxford OX3 9DU, UK
3. School of Clinical Medicine, University of Cambridge, Cambridge CB2 0SP, UK
4. Department of Pathology, University of Cambridge, Cambridge CB2 1QP, UK
5. Peter Medawar Building for Pathogen Research, University of Oxford, Oxford OX1 3SY, UK
Abstract
Coeliac disease (CeD) is a T-cell-mediated enteropathy triggered by dietary gluten, and it remains substantially under-diagnosed around the world. The diagnostic gold standard requires histological assessment of intestinal biopsies taken at endoscopy while the patient is consuming a gluten-containing diet. However, there is a lack of concordance between pathologists in histological assessment, and both endoscopy and gluten challenge are burdensome and unpleasant for patients. Identification of gluten-specific T-cell receptors (TCRs) in the TCR repertoire could provide a less subjective diagnostic test, and potentially remove the need to consume gluten. We review published gluten-specific TCR sequences and develop an interpretable machine learning model to investigate their diagnostic potential. To this end, we sequenced the TCR repertoires of mucosal CD4+ T cells from 20 patients with and without CeD. These data were used as a training dataset to develop the model, and an independently published dataset of 20 patients was then used as the testing dataset. We determined that this model has a training accuracy of 100% and a testing accuracy of 80% for the diagnosis of CeD, including in patients on a gluten-free diet (GFD). We identified 20 CD4+ TCR sequences with the highest diagnostic potential for CeD. The sequences identified here have the potential to provide an objective diagnostic test for CeD that does not require the consumption of gluten.
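As an illustration only, the general idea of an interpretable classifier built on a reference set of TCR sequences can be sketched as follows. This is not the model described in the abstract, whose details are not given here. The sketch assumes a simple rule: a repertoire is called CeD if it contains at least a threshold number of reference sequences. The names `REFERENCE_TCRS`, `match_count`, `predict_ced`, and `accuracy`, the threshold, and all sequences and labels are hypothetical placeholders, not real data.

```python
from typing import FrozenSet, Iterable, Sequence

# Hypothetical placeholder strings, NOT real gluten-specific TCR sequences.
REFERENCE_TCRS: FrozenSet[str] = frozenset({"CASSAAAF", "CASSBBBF", "CASSCCCF"})


def match_count(repertoire: Iterable[str],
                reference: FrozenSet[str] = REFERENCE_TCRS) -> int:
    """Count the distinct reference sequences present in one repertoire."""
    return len(set(repertoire) & reference)


def predict_ced(repertoire: Iterable[str], threshold: int = 1) -> bool:
    """Call a repertoire CeD if it holds at least `threshold` reference sequences.

    The rule is interpretable because every positive call can be traced
    back to the specific reference sequences that were found.
    """
    return match_count(repertoire) >= threshold


def accuracy(repertoires: Sequence[Iterable[str]],
             labels: Sequence[bool],
             threshold: int = 1) -> float:
    """Fraction of repertoires whose prediction agrees with the label."""
    correct = sum(predict_ced(r, threshold) == y
                  for r, y in zip(repertoires, labels))
    return correct / len(labels)


# Toy data, invented for this sketch: four repertoires with made-up labels.
toy_repertoires = [
    ["CASSAAAF", "CASSXXXF"],  # one reference match
    ["CASSYYYF", "CASSZZZF"],  # no reference match
    ["CASSBBBF", "CASSCCCF"],  # two reference matches
    ["CASSCCCF"],              # one match, but labelled non-CeD below
]
toy_labels = [True, False, True, False]

toy_accuracy = accuracy(toy_repertoires, toy_labels)
print(toy_accuracy)
```

In practice the threshold would be chosen on a training dataset and then evaluated unchanged on an independent testing dataset, mirroring the train/test split described in the abstract.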
Funder
Wellcome Trust, Innovate UK
Subject
Molecular Biology, Biochemistry