Abstract
In this paper, we study the post-hoc calibration of modern neural networks, a problem that has drawn considerable attention in recent years. Despite the many calibration methods proposed, there is no consensus yet on the inherent complexity of the task: some authors claim that simple functions solve the problem, while others suggest that more expressive models are needed to capture miscalibration. As a first approach, we focus on the task of confidence scaling, specifically on post-hoc methods that generalize Temperature Scaling, which we refer to as the Adaptive Temperature Scaling family. We begin by demonstrating that while complex models such as neural networks provide an advantage when data is ample, they fail when data is limited, a situation common in fields such as medical diagnosis. We then show that, under ideal data conditions, the more expressive methods learn a relationship between the entropy of a prediction and its level of overconfidence. Based on this observation, we propose Entropy-based Temperature Scaling, a simple method that scales the confidence of a prediction according to this relationship. Results show that our method obtains state-of-the-art performance and is robust against data scarcity. Moreover, our proposed model enables a deeper understanding of the calibration process by interpreting the entropy as a measure of uncertainty in the network outputs.
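To make the idea of an entropy-dependent temperature concrete, the following is a minimal sketch and not the paper's implementation. The abstract does not state the functional form, so the choice here (a softplus of a linear function of the log normalized entropy, with hypothetical scalar parameters `w` and `b`) is an assumption made for illustration, as are all function names.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def normalized_entropy(probs, eps=1e-12):
    # Shannon entropy divided by log(K), so the value lies in [0, 1].
    k = probs.shape[-1]
    h = -(probs * np.log(probs + eps)).sum(axis=-1)
    return h / np.log(k)

def entropy_temperature(logits, w, b, eps=1e-12):
    # ASSUMED form: T = softplus(w * log(H_norm) + b).
    # w and b are hypothetical parameters that would be fit on a
    # held-out calibration set; the fitting step is not shown here.
    h = normalized_entropy(softmax(logits))
    t = np.logaddexp(0.0, w * np.log(h + eps) + b)  # softplus
    return np.maximum(t, 1e-6)  # floor to avoid division by ~0

def scale_confidence(logits, w, b):
    # One temperature per sample, applied to that sample's logits.
    t = entropy_temperature(logits, w, b)
    return softmax(logits / t[..., None])
```

Under this assumed form, setting `w = 0` makes the temperature independent of the entropy, so the sketch reduces to standard Temperature Scaling with a single temperature `softplus(b)`. Because each sample is divided by one positive temperature, the predicted class is unchanged and only the confidence moves.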
Funder
Ministerio de Ciencia e Innovación
Universidad Autónoma de Madrid
Publisher
Springer Science and Business Media LLC