Abstract
Class imbalance poses a major challenge for machine learning, as most supervised learning models may exhibit bias towards the majority class and underperform on the minority class. Cost-sensitive learning tackles this problem by treating the classes differently, typically via a user-defined fixed misclassification cost matrix provided as input to the learner. Tuning such a matrix is challenging, requires domain knowledge, and, moreover, wrong adjustments can deteriorate overall predictive performance. In this work, we propose a novel cost-sensitive boosting approach for imbalanced data that dynamically adjusts the misclassification costs over the boosting rounds in response to the model's performance, instead of using a fixed misclassification cost matrix. Our method, called AdaCC, is parameter-free, as it relies on the cumulative behavior of the boosting model to adjust the misclassification costs for the next boosting round, and it comes with theoretical guarantees regarding the training error. Experiments on 27 real-world datasets from different domains with high class imbalance demonstrate the superiority of our method over 12 state-of-the-art cost-sensitive boosting approaches, with consistent improvements across measures: in the range of 0.3–28.56% for AUC, 3.4–21.4% for balanced accuracy, 4.8–45% for gmean, and 7.4–85.5% for recall.
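To make the idea of per-round dynamic cost adjustment concrete, below is a minimal Python sketch. It assumes decision stumps as weak learners, labels in {-1, +1} with +1 the minority class, and an AdaC2-style cost-weighted update; the specific cost formula (1 plus the cumulative ensemble's current error rate on the minority class) and the function name dynamic_cost_boost are illustrative assumptions based on the abstract's description, not the paper's exact AdaCC rule.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def dynamic_cost_boost(X, y, n_rounds=50):
    """Illustrative cost-sensitive boosting with dynamic costs.

    Assumes y is a numpy array with labels in {-1, +1}, where +1 is
    the minority class. The per-round cost update below is a sketch
    of "adjust costs from the ensemble's cumulative behavior", not
    the exact AdaCC formula.
    """
    n = len(y)
    w = np.full(n, 1.0 / n)          # sample weights
    minority = (y == 1)
    F = np.zeros(n)                  # cumulative ensemble score
    learners, alphas = [], []
    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=w)
        h = stump.predict(X)
        # weighted error and learner weight, as in standard AdaBoost
        err = np.clip(w[h != y].sum() / w.sum(), 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)
        learners.append(stump)
        alphas.append(alpha)
        F += alpha * h
        # dynamic cost: 1 + fraction of minority samples the ensemble
        # built so far still misclassifies (no user-set cost matrix)
        cost = 1.0 + np.mean(np.sign(F)[minority] != 1)
        # AdaC2-style multiplicative update with the dynamic cost
        C = np.where(minority, cost, 1.0)
        w = C * w * np.exp(-alpha * y * h)
        w /= w.sum()
    return learners, alphas
```

In this sketch, misclassified minority samples gain weight both through the exponential term and through the cost factor, and the cost shrinks back towards 1 as the ensemble's minority-class performance improves over the rounds.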
Funder
Gottfried Wilhelm Leibniz Universität Hannover
Publisher
Springer Science and Business Media LLC
Subject
Artificial Intelligence, Hardware and Architecture, Human-Computer Interaction, Information Systems, Software
Cited by
4 articles.