
Maintaining AUC and H-measure over time

Published: 2021-12-14
ISSN:0885-6125
Container-title:Machine Learning
language:en
Short-container-title:Mach Learn

Author:

Tatti, Nikolaj

Abstract

Measuring the performance of a classifier is a vital task in machine learning. The running time of an algorithm that computes the measure plays a very small role in an offline setting, for example, when the classifier is being developed by a researcher. However, the running time becomes more crucial if our goal is to monitor the performance of a classifier over time. In this paper we study three algorithms for maintaining two measures. The first algorithm maintains the area under the ROC curve (AUC) under addition and deletion of data points in $\mathcal{O}(\log n)$ time. This is done by maintaining the data points sorted in a self-balanced search tree. In addition, we augment the search tree so that we can query the ROC coordinates of a data point in $\mathcal{O}(\log n)$ time. In doing so we are able to maintain AUC in $\mathcal{O}(\log n)$ time. Our next two algorithms maintain the H-measure, an alternative measure based on the ROC curve. Computing the measure is a two-step process: first we need to compute a convex hull of the ROC curve, followed by a sum over the convex hull. We demonstrate that we can maintain the convex hull using a minor modification of the classic convex hull maintenance algorithm. We then show that under certain conditions, we can compute the H-measure exactly in $\mathcal{O}(\log^2 n)$ time, and if the conditions are not met, then we can estimate the H-measure in $\mathcal{O}((\log n + \epsilon^{-1})\log n)$ time. We show empirically that our methods are significantly faster than the baselines.
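To make the idea of maintaining AUC under additions and deletions concrete, here is a minimal Python sketch. It is not the paper's algorithm: the abstract describes an augmented self-balanced search tree, whereas this sketch swaps in a Fenwick (binary indexed) tree and assumes that scores have already been discretized into a fixed number of integer bins. Under that assumption each update costs logarithmic time in the number of bins. The names `Fenwick`, `StreamingAUC`, and `num_bins` are hypothetical and chosen only for illustration.

```python
class Fenwick:
    """Binary indexed tree holding counts at integer positions 0..size-1."""

    def __init__(self, size):
        self.tree = [0] * (size + 1)

    def add(self, index, delta):
        i = index + 1
        while i < len(self.tree):
            self.tree[i] += delta
            i += i & -i

    def prefix(self, index):
        """Sum of counts at positions 0..index (0 when index < 0)."""
        total = 0
        i = index + 1
        while i > 0:
            total += self.tree[i]
            i -= i & -i
        return total


class StreamingAUC:
    """Maintains AUC while labeled, binned scores are added and removed.

    Assumption: scores are integers in 0..num_bins-1 (pre-discretized).
    """

    def __init__(self, num_bins):
        self.trees = {True: Fenwick(num_bins), False: Fenwick(num_bins)}
        self.counts = {True: 0, False: 0}
        # 2 * (pairs where the positive outranks the negative) + (tied pairs);
        # kept doubled so that ties, worth one half, stay integral.
        self.twice_u = 0

    def _update(self, score_bin, is_positive, sign):
        other = self.trees[not is_positive]
        below = other.prefix(score_bin - 1)
        tied = other.prefix(score_bin) - below
        if is_positive:
            wins = below  # negatives scored strictly lower
        else:
            wins = self.counts[True] - below - tied  # positives scored strictly higher
        self.twice_u += sign * (2 * wins + tied)
        self.trees[is_positive].add(score_bin, sign)
        self.counts[is_positive] += sign

    def add(self, score_bin, is_positive):
        self._update(score_bin, is_positive, +1)

    def remove(self, score_bin, is_positive):
        self._update(score_bin, is_positive, -1)

    def auc(self):
        pairs = self.counts[True] * self.counts[False]
        return self.twice_u / (2 * pairs) if pairs else float("nan")


monitor = StreamingAUC(num_bins=8)
for score_bin, label in [(3, True), (5, True), (1, False), (5, False)]:
    monitor.add(score_bin, label)
auc_before = monitor.auc()   # one tie and one loss among four pairs
monitor.remove(5, False)
auc_after = monitor.auc()    # both remaining pairs are ranked correctly
```

The contribution of a point to the pair count depends only on the counts of the opposite class, which is why the same update routine serves both insertion and deletion. Handling arbitrary real-valued scores without binning would need an order-statistic structure such as the balanced tree the abstract mentions.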

Funder

University of Helsinki including Helsinki University Central Hospital

Publisher

Springer Science and Business Media LLC

Subject

Artificial Intelligence,Software

Link

https://link.springer.com/content/pdf/10.1007/s10994-021-06084-6.pdf
