1. Dreiseitl, S., Osl, M.: Testing the calibration of classification models from first principles. In: Proceedings of the AMIA Annual Fall Symposium 2012, Chicago, USA, pp. 164–169 (2012)
2. Fortet, R., Mourier, E.: Convergence de la répartition empirique vers la répartition théorique. Annales Scientifiques de l’École Normale Supérieure 70, 266–285 (1953)
3. Gama, J., Žliobaitė, I., Bifet, A., Pechenizkiy, M., Bouchachia, A.: A survey on concept drift adaptation. ACM Comput. Surv. 46, 1–37 (2014)
4. Gretton, A., Borgwardt, K., Rasch, M., Schölkopf, B., Smola, A.: A kernel two-sample test. J. Mach. Learn. Res. 13, 723–773 (2012)
5. Kelly, C., Karthikesalingam, A., Suleyman, M., Corrado, G., King, D.: Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 17, 195 (2019)