Imbalance-Robust Multi-Label Self-Adjusting kNN

Published:2024-07-26 Issue:8 Volume:18 Page:1-30
ISSN:1556-4681
Container-title:ACM Transactions on Knowledge Discovery from Data
language:en
Short-container-title:ACM Trans. Knowl. Discov. Data

Author:

Nicola Victor Gomes De Oliveira Martins¹, Delgado Karina Valdivia¹, Lauretto Marcelo de Souza¹

Affiliation:

1. University of São Paulo, São Paulo, Brazil

Abstract

In multi-label classification over data streams, instances arriving in real time must be associated with multiple labels simultaneously. Various methods based on the k-Nearest Neighbors algorithm have been proposed for this task. However, these methods face limitations when dealing with imbalanced data streams, a problem that has received limited attention in existing work. To address this gap, this article introduces the Imbalance-Robust Multi-Label Self-Adjusting kNN (IRMLSAkNN), designed to tackle multi-label imbalanced data streams. IRMLSAkNN’s strength lies in retaining relevant instances with imbalanced labels by using a discarding mechanism that considers the imbalance ratio per label. In addition, it evaluates subwindows with an imbalance-aware measure to discard older instances that perform poorly. We conducted statistical experiments on 32 benchmark data streams, evaluating IRMLSAkNN against eight multi-label classification algorithms using common accuracy-aware and imbalance-aware measures. The results demonstrate that IRMLSAkNN consistently outperforms these algorithms in terms of predictive capacity and time cost across various levels of imbalance.

Funder

CEPID-CeMEAI-Center for Mathematical Sciences Applied to Industry

Publisher

Association for Computing Machinery (ACM)

Link

https://dl.acm.org/doi/pdf/10.1145/3663575

References (52 articles)

1. A survey on learning from imbalanced data streams: Taxonomy, challenges, empirical study, and reproducible experimental framework; Aguiar Gabriel; Machine Learning, 2022

2. Adaptive ensemble of self-adjusting nearest neighbor subspaces for multi-label drifting data streams

3. Albert Bifet and Ricard Gavalda. 2007. Learning from time-changing data with adaptive windowing. In Proceedings of the SIAM International Conference on Data Mining (SDM ’07). 443–448.

4. New ensemble methods for evolving data streams