Expert-level detection of M-proteins in serum protein electrophoresis using machine learning
Author:
Elfert Eike (1), Kaminski Wolfgang E. (1, 2), Matek Christian (3), Hoermann Gregor (4), Axelsen Eyvind W. (5, 6), Marr Carsten (3), Piehler Armin P. (4, 5)
Affiliation:
1. Medical Faculty Mannheim, University of Heidelberg, Mannheim, Germany
2. Ingenium Digital Diagnostics GmbH, Frankfurt, Germany
3. Institute of AI for Health, Helmholtz Munich – German Research Center for Environmental Health, Neuherberg, Germany
4. MLL Munich Leukemia Laboratory, Munich, Germany
5. Fürst Medical Laboratory, Oslo, Norway
6. Department of Informatics, University of Oslo, Oslo, Norway
Abstract
Objectives
Serum protein electrophoresis (SPE) in combination with immunotyping (IMT) is the diagnostic standard for detecting monoclonal proteins (M-proteins). However, interpretation of SPE and IMT is poorly standardized, time-consuming, and investigator-dependent. Here, we present five machine learning (ML) approaches for automated detection of M-proteins on SPE, trained on an unprecedentedly large and well-curated data set, and compare their performance with that of laboratory experts.
Methods
SPE and IMT were performed on serum samples from 69,722 individuals from Norway. IMT results were used to label the samples as M-protein present (positive, n=4,273) or absent (negative, n=65,449). Four feature-based ML algorithms and one convolutional neural network (CNN) were trained on 68,722 randomly selected SPE patterns to detect M-proteins. Algorithm performance was compared to that of an expert group of clinical pathologists and laboratory technicians (n=10) on a test set of 1,000 samples.
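The sample counts above can be checked with a short sketch. It assumes a plain uniform random hold-out of 1,000 samples. The abstract does not say whether the split was stratified, and the function name `split_indices` and the seed are hypothetical choices made for illustration.

```python
import random

# Counts taken from the Methods paragraph.
N_TOTAL = 69_722     # individuals with SPE and IMT
N_POSITIVE = 4_273   # M-protein present
N_NEGATIVE = 65_449  # M-protein absent
N_TEST = 1_000       # held-out test set


def split_indices(n_total, n_test, seed=0):
    """Hypothetical helper: uniform random train/test split of sample indices.

    Assumption: a simple shuffle; the source only says 'randomly selected'.
    """
    rng = random.Random(seed)
    indices = list(range(n_total))
    rng.shuffle(indices)
    return indices[n_test:], indices[:n_test]


train_idx, test_idx = split_indices(N_TOTAL, N_TEST)
prevalence = N_POSITIVE / N_TOTAL

print(f"train: {len(train_idx)}, test: {len(test_idx)}")
print(f"share of M-protein positive samples: {prevalence:.1%}")
```

Roughly 6 % of the samples are positive, so the classes are strongly imbalanced. On such data accuracy alone says little, and the F1-score and predictive values reported in the Results are more informative.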
Results
The random forest classifier (RFC) showed the best performance (F1-Score 93.2 %, accuracy 99.1 %, sensitivity 89.9 %, specificity 99.8 %, positive predictive value 96.9 %, negative predictive value 99.3 %) and outperformed the experts (F1-Score 61.2 ± 16.0 %, accuracy 89.2 ± 10.2 %, sensitivity 94.3 ± 2.8 %, specificity 88.9 ± 10.9 %, positive predictive value 47.3 ± 16.2 %, negative predictive value 99.5 ± 0.2 %) on the test set. Interestingly, while the performance of the RFC saturated, that of the CNN increased steadily within our training set (n=68,722).
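As a reminder of how these six metrics relate, the following sketch computes them from a binary confusion matrix. The abstract does not report the confusion matrix. The counts used here (62 true positives, 7 false negatives, 2 false positives, 929 true negatives) are an assumption: one set of integers summing to 1,000 that happens to reproduce the percentages reported for the random forest classifier. The function name `binary_metrics` is hypothetical.

```python
def binary_metrics(tp, fn, fp, tn):
    """Standard binary classification metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),            # recall on positives
        "specificity": tn / (tn + fp),            # recall on negatives
        "ppv": tp / (tp + fp),                    # positive predictive value
        "npv": tn / (tn + fn),                    # negative predictive value
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "f1": 2 * tp / (2 * tp + fp + fn),        # harmonic mean of PPV and sensitivity
    }


# Assumed counts, NOT taken from the source: chosen only because they are
# consistent with the reported percentages on a 1,000-sample test set.
metrics = binary_metrics(tp=62, fn=7, fp=2, tn=929)

for name, value in metrics.items():
    print(f"{name:12s} {value:.1%}")
```

With so few positives in the test set, a handful of false positives is enough to move the positive predictive value substantially. This is consistent with the experts combining high sensitivity with a much lower positive predictive value.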
Conclusions
Feature-based ML systems are capable of automated detection of M-proteins on SPE beyond expert level and show potential for use in the clinical laboratory.
Funder
H2020 European Research Council
Publisher
Walter de Gruyter GmbH