Generality-Training of a Classifier for Improved Calibration in Unseen Contexts

Published: 2023 Page: 374-391
ISSN: 0302-9743
Container-title: Machine Learning and Knowledge Discovery in Databases: Research Track

Author:

Leelar Bhawani Shankar, Kull Meelis

Abstract

Artificial neural networks tend to output class probabilities that are miscalibrated, i.e., their reported uncertainty is not a very good indicator of how much we should trust the model. Consequently, methods have been developed to improve the model’s predictive uncertainty, both during training and post-hoc. Even if the model is calibrated on the domain used in training, it typically becomes over-confident when applied to slightly different target domains, e.g., due to perturbations or shifts in the data. The model can be recalibrated for a fixed list of target domains, but its performance can still be poor on unseen target domains. To address this issue, we propose a generality-training procedure that learns a modified head for the neural network to achieve better calibration generalization to new domains while retaining calibration performance on the given domains. This generality-head is trained on multiple domains using a new objective function with increased emphasis on the calibration loss compared to cross-entropy. Such training results in a more general model in the sense of not only better calibration but also better accuracy on unseen domains, as we demonstrate experimentally on multiple datasets. The code and supplementary material for the paper are available at https://github.com/bsl-traveller/CaliGen.git.
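To make the notion of miscalibration in the abstract concrete, the sketch below computes the expected calibration error (ECE) with equal-width confidence bins, one common way to quantify the gap between reported confidence and observed accuracy. This is an illustration only: the abstract does not state which calibration measure or loss the paper uses, and the function name `expected_calibration_error`, the bin count, and the toy data are all assumptions made here.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Equal-width-bin ECE: weighted mean of |accuracy - confidence| per bin.

    Hypothetical helper for illustration; not taken from the paper's code.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("need at least one prediction")

    # Per-bin running sums: [sum of confidences, number correct, count].
    bins = [[0.0, 0, 0] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # A confidence of exactly 1.0 is placed in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx][0] += conf
        bins[idx][1] += int(ok)
        bins[idx][2] += 1

    ece = 0.0
    for conf_sum, n_correct, count in bins:
        if count == 0:
            continue  # empty bins contribute nothing
        avg_conf = conf_sum / count
        accuracy = n_correct / count
        ece += (count / n) * abs(accuracy - avg_conf)
    return ece


# Toy data, made up for illustration: identical correctness, two confidence levels.
correct = [True, True, True, False] * 25   # 75% of predictions are right
calibrated = [0.75] * 100                  # confidence matches accuracy
overconfident = [0.95] * 100               # confidence exceeds accuracy

ece_calibrated = expected_calibration_error(calibrated, correct)
ece_overconfident = expected_calibration_error(overconfident, correct)
print(ece_calibrated, ece_overconfident)
```

In this toy setting the over-confident predictions yield a much larger ECE than the calibrated ones, even though both sets have the same accuracy, which is the kind of gap that recalibration methods aim to close.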

Publisher

Springer Nature Switzerland

Link

https://link.springer.com/content/pdf/10.1007/978-3-031-43424-2_23

Reference: 45 articles.

1. Krishnan, R., Tickoo, O.: Improving model calibration with accuracy versus uncertainty optimization. In: Advances in Neural Information Processing Systems, vol. 33, pp. 18237–18248 (2020)

2. Kumar, A., Sarawagi, S., Jain, U.: Trainable calibration measures for neural networks from kernel mean embeddings. In: Dy, J., Krause, A. (eds.) Proceedings of the 35th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 80, pp. 2805–2814. PMLR (2018). https://proceedings.mlr.press/v80/kumar18a.html

3. Mukhoti, J., et al.: Calibrating deep neural networks using focal loss. In: Larochelle, H., et al. (eds.) Advances in Neural Information Processing Systems, vol. 33, pp. 15288–15299. Curran Associates Inc. (2020). https://proceedings.neurips.cc/paper/2020/file/aeb7b30ef1d024a76f21a1d40e30c302-Paper.pdf

4. Lin, T.-Y., et al.: Focal loss for dense object detection. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2980–2988 (2017)

5. Cheng, J., Vasconcelos, N.: Calibrating deep neural networks by pairwise constraints. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13709–13718 (2022)