Authors:
Zhang Chenkang, Tian Haobing, Zhang Lang, Jiao Pengju
Abstract
AUC (area under the ROC curve) is an essential metric that has been extensively studied in machine learning. Traditional AUC optimization methods require a large-scale clean dataset, whereas real-world datasets usually contain many noisy samples. To reduce the impact of noisy samples, many robust AUC optimization methods have been proposed. However, these methods use only noisy data and ignore the benefit of clean data. To make full use of both clean and noisy data, we propose a new framework for AUC optimization that uses clean samples to guide the processing of the noisy dataset, based on self-paced learning (SPL). We also introduce a consistency regularization term to reduce the negative impact of data enhancement techniques on SPL. Traditional SPL methods usually suffer from the high complexity of alternately solving two critical sub-problems, one with respect to the sample weights and one with respect to the model parameters. To speed up training, we propose a new efficient algorithm that alternately updates the sample weights and the model parameters with the stochastic gradient method. Theoretically, we prove that our new optimization method converges to a stationary point. Comprehensive experiments demonstrate that our robust AUC optimization (RAUCO) algorithm is more robust than existing algorithms.
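The alternating scheme described in the abstract can be illustrated with a generic self-paced AUC sketch. This is not the authors' RAUCO algorithm: it omits the clean-sample guidance and the consistency regularization term. The squared-hinge pairwise surrogate, the linear scorer, the per-pair weights `v`, the names `pairwise_auc` and `spl_auc_sgd`, and all hyperparameters are assumptions made for illustration only.

```python
import numpy as np


def pairwise_auc(scores, labels):
    """Empirical AUC: share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0) + 0.5 * (diff == 0)).mean())


def spl_auc_sgd(X, y, lam=1.0, lr_w=0.01, lr_v=0.5, steps=3000, seed=0):
    """Alternate stochastic gradient steps on model parameters and SPL weights.

    Assumed objective (illustrative, not taken from the paper):
        sum_ij v_ij * max(0, 1 - w.(x_i - x_j))^2  -  lam * sum_ij v_ij,
    with v_ij constrained to [0, 1].
    """
    rng = np.random.default_rng(seed)
    pos_idx = np.flatnonzero(y == 1)
    neg_idx = np.flatnonzero(y == 0)
    w = np.zeros(X.shape[1])
    v = np.ones((len(pos_idx), len(neg_idx)))  # one weight per pair (assumption)

    for _ in range(steps):
        # Sample one positive-negative pair uniformly at random.
        i = rng.integers(len(pos_idx))
        j = rng.integers(len(neg_idx))
        d = X[pos_idx[i]] - X[neg_idx[j]]
        margin = 1.0 - w @ d
        loss = max(margin, 0.0) ** 2

        # Gradient step on w for the weighted pair loss.
        if margin > 0:
            w += lr_w * v[i, j] * 2.0 * margin * d

        # Projected gradient step on v: d/dv (v * loss - lam * v) = loss - lam.
        # Pairs with loss above lam are gradually down-weighted toward 0.
        v[i, j] = np.clip(v[i, j] - lr_v * (loss - lam), 0.0, 1.0)

    return w, v
```

In this sketch, `lam` plays the usual SPL role of an age parameter: pairs whose loss exceeds it are treated as likely noisy and lose weight, while low-loss pairs keep a weight of 1. Updating a single weight per step, rather than solving the weight sub-problem exactly, is what keeps each iteration cheap.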
Publisher
Springer Science and Business Media LLC