GUESS: projecting machine learning scores to well-calibrated probability estimates for clinical decision-making

Published:2018-11-29 Issue:14 Volume:35 Page:2458-2465
ISSN:1367-4803
Container-title:Bioinformatics
language:en

Author:

Schwarz Johanna, Heider Dominik

Abstract

Motivation: Clinical decision support systems have been applied in numerous fields, ranging from cancer survival to drug resistance prediction. Nevertheless, clinical decision support systems typically have a caveat: many of them are perceived as black boxes by non-experts and, unfortunately, the obtained scores cannot usually be interpreted as class probability estimates. In probability-focused medical applications, it is not sufficient to perform well with regard to discrimination and, consequently, various calibration methods have been developed to enable probabilistic interpretation. The aims of this study were (i) to develop a tool for fast and comparative analysis of different calibration methods, (ii) to demonstrate their limitations for use on clinical data and (iii) to introduce our novel method GUESS.

Results: We compared the performances of two different state-of-the-art calibration methods, namely histogram binning and Bayesian Binning in Quantiles, as well as our novel method GUESS, on both simulated and real-world datasets. GUESS demonstrated calibration performance comparable to the state-of-the-art methods and always retained accurate class discrimination. GUESS showed superior calibration performance in small datasets and therefore may be an optimal calibration method for typical clinical datasets. Moreover, we provide a framework (CalibratR) for R, which can be used to identify the most suitable calibration method for novel datasets in a timely and efficient manner. Using calibrated probability estimates instead of original classifier scores will contribute to the acceptance and dissemination of machine learning-based classification models in cost-sensitive applications, such as clinical research.

Availability and implementation: GUESS as part of CalibratR can be downloaded from CRAN.
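To make the idea of score calibration concrete, the following is a minimal sketch of histogram binning, one of the baseline methods named in the abstract. It is not GUESS and not the CalibratR implementation. The function names, the equal-width binning on [0, 1], and the midpoint fallback for empty bins are all assumptions chosen for illustration.

```python
from bisect import bisect_right

def fit_histogram_binning(scores, labels, n_bins=5):
    """Learn one probability per equal-width score bin on [0, 1].

    Hypothetical helper for illustration; assumes scores lie in [0, 1]
    and labels are 0 or 1.
    """
    # Interior bin edges, e.g. [0.5] for two bins.
    edges = [i / n_bins for i in range(1, n_bins)]
    positives = [0] * n_bins
    totals = [0] * n_bins
    for score, label in zip(scores, labels):
        b = bisect_right(edges, score)  # index of the bin holding this score
        totals[b] += 1
        positives[b] += label
    # The calibrated estimate for a bin is its observed fraction of positives.
    # Empty bins fall back to the bin midpoint (an arbitrary choice here).
    probs = [
        positives[b] / totals[b] if totals[b] else (b + 0.5) / n_bins
        for b in range(n_bins)
    ]
    return edges, probs

def calibrate(score, edges, probs):
    """Map a raw classifier score to the estimate of the bin it falls in."""
    return probs[bisect_right(edges, score)]

# Toy data: raw scores and the true class of each sample.
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
edges, probs = fit_histogram_binning(scores, labels, n_bins=2)
```

With these toy data, a new score is replaced by the fraction of positive training samples in its bin, which is what allows the output to be read as a class probability estimate. With few samples per bin these fractions become noisy, which may be one reason binning-based calibration is harder on small datasets.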

Funder

Deichmann foundation

Publisher

Oxford University Press (OUP)

Subject

Computational Mathematics, Computational Theory and Mathematics, Computer Science Applications, Molecular Biology, Biochemistry, Statistics and Probability

Link

http://academic.oup.com/bioinformatics/article-pdf/35/14/2458/28913220/bty984.pdf

Reference 34 articles.

1. De novo pathway-based biomarker identification;Alcaraz;Nucleic Acids Res,2017

2. Linking cytoscape and the corynebacterial reference database coryneregnet;Baumbach;BMC Genomics,2008

3. The end of medicine as we know it;Baumbach;Syst. Med,2018

4. The GALAD scoring algorithm based on AFP, AFP-l3, and DCP significantly improves detection of BCLC early stage hepatocellular carcinoma;Best;Z. Gastroenterol,2016

5. Big data and machine learning in radiation oncology: state of the art and future prospects;Bibault;Cancer Lett,2016

Cited by 32 articles.

1. Rising Temperatures, Falling Leaves: Predicting the Fate of Cyprus’s Endemic Oak under Climate and Land Use Change;Plants;2024-04-16

2. POLAR: prediction of prolonged mechanical ventilation in patients with myasthenic crisis;Journal of Neurology;2024-02-08

3. Assessing the Vulnerability of Medicinal and Aromatic Plants to Climate and Land-Use Changes in a Mediterranean Biodiversity Hotspot;Land;2024-01-24

4. BioPRO: Context-Infused Prompt Learning for Biomedical Entity Linking;IEEE/ACM Transactions on Audio, Speech, and Language Processing;2024

5. Conservation Responsibility for Priority Habitats under Future Climate Conditions: A Case Study on Juniperus drupacea Forests in Greece;Land;2023-10-26