Classification Confidence in Exploratory Learning: A User’s Guide
Published: 2023-07-21
Volume: 5
Issue: 3
Pages: 803-829
ISSN: 2504-4990
Container-title: Machine Learning and Knowledge Extraction
Short-container-title: MAKE
Language: en
Authors:
Peter Salamon 1 (ORCID), David Salamon 1, V. Adrian Cantu 2, Michelle An 3, Tyler Perry 2, Robert A. Edwards 4 (ORCID), Anca M. Segall 5 (ORCID)
Affiliations:
1. Department of Mathematics, San Diego State University, San Diego, CA 92182, USA
2. Computational Science Research Center, San Diego State University, San Diego, CA 92182, USA
3. Bioinformatics and Medical Informatics Program, San Diego State University, San Diego, CA 92182, USA
4. Flinders Accelerator for Microbiome Exploration, Flinders University, Adelaide, SA 5001, Australia
5. Department of Biology, San Diego State University, San Diego, CA 92182, USA
Abstract
This paper investigates the post-hoc calibration of confidence for “exploratory” machine learning classification problems. The difficulty in these problems stems from the continuing desire, when curating datasets, to push the boundaries of which categories have enough examples to generalize from, and from confusion regarding the validity of those categories. We argue that for such problems the “one-versus-all” approach (top-label calibration) must be used rather than the “calibrate-the-full-response-matrix” approach advocated elsewhere in the literature. We introduce and test four new algorithms designed to handle the idiosyncrasies of category-specific confidence estimation using only the test set and the final model. Chief among these methods is the use of kernel density ratios for confidence calibration, including a novel algorithm for choosing the bandwidth. We test our claims and explore the limits of calibration on a bioinformatics application (PhANNs) as well as on the classic MNIST benchmark. Finally, our analysis argues that post-hoc calibration should always be performed, may be performed using only the test dataset, and should be sanity-checked visually.
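To make the density-ratio idea concrete, here is a minimal Python sketch, not the paper's exact algorithm: it estimates P(prediction correct | top-label score) as a ratio of kernel density estimates fit to the test-set scores of correct and incorrect predictions. The paper's novel bandwidth-selection rule is not reproduced here; scipy's gaussian_kde default (Scott's rule) stands in for it, and the synthetic beta-distributed scores in the demo are illustrative assumptions.

```python
import numpy as np
from scipy.stats import gaussian_kde

def top_label_confidence(scores, correct, query):
    """Estimate P(prediction correct | top-label score) as a ratio of
    kernel density estimates: one KDE fit to the scores of correct
    predictions, one to the scores of incorrect predictions."""
    scores = np.asarray(scores, dtype=float)
    correct = np.asarray(correct, dtype=bool)
    p_hit = correct.mean()                     # prior P(prediction is correct)
    kde_hit = gaussian_kde(scores[correct])    # score density given a correct prediction
    kde_miss = gaussian_kde(scores[~correct])  # score density given an incorrect prediction
    num = p_hit * kde_hit(query)
    return num / (num + (1.0 - p_hit) * kde_miss(query))

# Synthetic demo: correct predictions tend to receive higher top-label scores.
rng = np.random.default_rng(0)
scores = np.concatenate([rng.beta(8, 2, 800), rng.beta(3, 3, 200)])
correct = np.concatenate([np.ones(800, bool), np.zeros(200, bool)])
print(top_label_confidence(scores, correct, np.array([0.5, 0.8, 0.95])))
```

Because the two KDEs are one-dimensional and fit per category (or per top-label score), this style of estimator needs only the final model's test-set scores, consistent with the abstract's claim that calibration can be done using the test set alone.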
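The recommendation to sanity-check calibration visually is typically implemented with a reliability diagram, which bins stated confidences and plots empirical accuracy per bin against the diagonal. The sketch below is a generic illustration; the equal-width binning scheme and plot styling are our assumptions, not taken from the paper.

```python
import numpy as np
import matplotlib.pyplot as plt

def reliability_diagram(confidence, correct, n_bins=10):
    """Plot empirical accuracy against stated confidence in equal-width
    bins; a well-calibrated model tracks the diagonal."""
    confidence = np.asarray(confidence, dtype=float)
    correct = np.asarray(correct, dtype=bool)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    idx = np.clip(np.digitize(confidence, edges) - 1, 0, n_bins - 1)
    x, y = [], []
    for b in range(n_bins):
        mask = idx == b
        if mask.any():
            x.append(confidence[mask].mean())  # mean stated confidence in bin
            y.append(correct[mask].mean())     # empirical accuracy in bin
    plt.plot([0, 1], [0, 1], "k--", label="perfect calibration")
    plt.plot(x, y, "o-", label="model")
    plt.xlabel("stated confidence")
    plt.ylabel("empirical accuracy")
    plt.legend()
    plt.show()
```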
Funder
NIDDK Computational and Experimental Resources for Virome Analysis in Inflammatory Bowel Disease
Subject
Artificial Intelligence, Engineering (miscellaneous)