MO386: Visualising and Differentiating Kidney Disorders by Urinary Peptidomics Using a Machine Learning Approach

Author:

Siwy Justyna [1], Mavrogeorgis Emmanouil [1,2], He Tianlin [1], Mischak Harald [1], Rupprecht Harald [3,4,5], Beige Joachim [6,7,8]

Affiliation:

1. Mosaiques-Diagnostics GmbH, Hannover, Germany

2. RWTH Aachen University Hospital, Institute for Molecular Cardiovascular Research (IMCAR), Aachen, Germany

3. Klinikum Bayreuth GmbH, Department of Nephrology, Angiology and Rheumatology, Bayreuth, Germany

4. Kuratorium for Dialysis and Transplantation (KfH) Bayreuth, Bayreuth, Germany

5. Friedrich-Alexander-University Erlangen-Nürnberg, Medizincampus Oberfranken, Bayreuth, Germany

6. St. Georg Hospital Leipzig, Department of Infectious Diseases/Tropical Medicine, Nephrology/KfH Renal Unit and Rheumatology, Leipzig, Germany

7. Hospital St. Georg, Kuratorium for Dialysis and Transplantation (KfH) Renal Unit, Leipzig, Germany

8. Martin-Luther-University Halle/Wittenberg, Department of Internal Medicine II, Halle/Saale, Germany

Abstract

BACKGROUND AND AIMS More than 20 000 native peptides are currently known in urine. They are highly dynamic and reflect the status of different organs, especially the kidney. Characterization of urinary peptide profiles (UPP) makes it possible to depict kidney disease severity, progression and fibrosis, and provides information about disease etiology. Advanced machine learning algorithms can combine the changes in the very complex UPP that are associated with specific disease etiologies and reduce the data space to only a few dimensions. Here, we show the application of a supervised machine learning pipeline for the visualization of different chronic kidney disease (CKD) etiologies based on high-dimensional peptidomics data, toward non-invasive disease classification.

METHOD The Uniform Manifold Approximation and Projection (UMAP) algorithm was used as a novel nonlinear dimensionality-reduction technique to visualize and differentiate the UPP of patients with CKD of different etiologies. UPP of individual CKD patients with diabetic kidney disease (DKD, n = 386), IgA nephropathy (n = 743) or vasculitis (n = 150) and of 369 healthy controls were extracted from the Human Urinary Proteome Database, which contains >85 000 proteomics datasets analyzed using capillary electrophoresis coupled to mass spectrometry. About 80% of the extracted datasets were used as a training set and 20% as a validation (test) set.

RESULTS When supervised UMAP was applied to the DKD patient and control datasets, excellent separation was observed, with an F1 score of 99.5% ± 0.9% in the training set and 93.1% ± 3.3% in the independent test set. Subsequently, this approach was applied to differentiate controls and the three kidney diseases (DKD, IgA nephropathy and vasculitis) simultaneously. In the training set, an accuracy of up to 98% for DKD and controls and an overall F1 score of 93.7% ± 2.3% (Figure) were achieved. In the independent test set, accuracy decreased as expected to around 90% for controls, 83.8% for IgA nephropathy, 79.2% for DKD and 66.7% for vasculitis. The overall F1 score in the test set was 81.9% ± 2.2%. Of note, controls (n = 369) were consistently classified with the highest accuracy of all groups, whereas the disease with the smallest sample size (vasculitis, n = 150) always showed the lowest accuracy. A substantial proportion of vasculitis samples was classified as IgA nephropathy, which has the largest sample size (n = 743). A permutation test was used to validate the pipeline. It was repeated 100 times using all samples from the CKD-free controls and the three kidney diseases. The resulting scores were normally distributed, with a mean of 32.5% and a standard deviation of 1.2%. Compared with the true F1 score of 81.9% reported above, the probability of obtaining such a high score by chance is very low (P < 0.01).

CONCLUSION We show that UMAP combined with supervised machine learning, applied to high-dimensional peptidomics data, enables multiple kidney diseases to be distinguished with good accuracy and with a very small standard deviation between multiple train-test splits. To our knowledge, our study is the first of its kind to reduce the complexity of the urinary peptidome to a single point in space and to categorize disease etiology based on this spatial information. The approach presented has the potential to enable non-invasive differential diagnosis of kidney disease etiologies. To improve the accuracy of this non-invasive method, the inclusion of additional clinical parameters will be tested.
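The permutation test reported in the RESULTS can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not specify the classifier, how the F1 score was averaged, or how the P value was derived. The sketch assumes a nearest-centroid classifier on invented 2-D points (standing in for a UMAP embedding), a macro-averaged F1 score, and an empirical P value with the usual +1 correction. Only the group sizes and the 100 repetitions are taken from the abstract; all function and variable names are hypothetical. For orientation, the reported figures place the true score (81.9 − 32.5) / 1.2 ≈ 41 standard deviations above the permutation mean, and with 100 permutations the smallest attainable empirical P value is 1/101 ≈ 0.0099.

```python
import random
import statistics

def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-class F1 scores (an assumed averaging)."""
    scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

def fit_centroids(X, y):
    """Nearest-centroid 'training': the mean point of each class."""
    groups = {}
    for point, c in zip(X, y):
        groups.setdefault(c, []).append(point)
    return {c: (statistics.fmean(p[0] for p in pts),
                statistics.fmean(p[1] for p in pts))
            for c, pts in groups.items()}

def predict(centroids, X):
    """Assign each point to the class with the closest centroid."""
    def sq_dist(point, c):
        return (point[0] - centroids[c][0]) ** 2 + (point[1] - centroids[c][1]) ** 2
    return [min(centroids, key=lambda c: sq_dist(point, c)) for point in X]

def split_score(X, y, labels, rng, test_frac=0.2):
    """One random 80/20 split; returns macro F1 on the held-out 20%."""
    idx = list(range(len(X)))
    rng.shuffle(idx)
    n_test = int(len(idx) * test_frac)
    test, train = idx[:n_test], idx[n_test:]
    centroids = fit_centroids([X[i] for i in train], [y[i] for i in train])
    preds = predict(centroids, [X[i] for i in test])
    return macro_f1([y[i] for i in test], preds, labels)

rng = random.Random(0)
# Group sizes come from the abstract; the cluster centres are invented.
sizes = {"control": 369, "DKD": 386, "IgAN": 743, "vasculitis": 150}
centres = {"control": (0, 0), "DKD": (3, 0), "IgAN": (0, 3), "vasculitis": (2, 3)}
X, y = [], []
for c, n in sizes.items():
    for _ in range(n):
        X.append((rng.gauss(centres[c][0], 1.0), rng.gauss(centres[c][1], 1.0)))
        y.append(c)
labels = list(sizes)

# Score with the true labels.
true_f1 = split_score(X, y, labels, rng)

# Null distribution: shuffle the labels, then refit and rescore, 100 times.
null = []
for _ in range(100):
    y_perm = y[:]
    rng.shuffle(y_perm)
    null.append(split_score(X, y_perm, labels, rng))

# Empirical P value: share of permuted scores at least as high as the true one.
p_value = (1 + sum(s >= true_f1 for s in null)) / (1 + len(null))
z_score = (true_f1 - statistics.fmean(null)) / statistics.stdev(null)
print(f"true F1={true_f1:.3f}  null mean={statistics.fmean(null):.3f}  "
      f"z={z_score:.1f}  p={p_value:.4f}")
```

Shuffling the labels breaks any link between profile and diagnosis while preserving the class imbalance, so the permuted scores show what the pipeline would achieve by chance on groups of these sizes.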

Publisher

Oxford University Press (OUP)

Subject

Transplantation, Nephrology

Cited by 1 article.
