Request-and-Reverify: Hierarchical Hypothesis Testing for Concept Drift Detection with Expensive Labels

Published: 2018-07
Container-title: Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence

Author:

Shujian Yu¹,², Xiaoyang Wang¹, José C. Príncipe²

Affiliation:

1. Nokia Bell Labs, Murray Hill, NJ, USA

2. University of Florida, Gainesville, FL, USA

Abstract

One important assumption underlying common classification models is the stationarity of the data. However, in real-world streaming applications, the data concept indicated by the joint distribution of feature and label is not stationary but drifting over time. Concept drift detection aims to detect such drifts and adapt the model so as to mitigate any deterioration in the model's predictive performance. Unfortunately, most existing concept drift detection methods rely on a strong and over-optimistic condition that the true labels are available immediately for all already classified instances. In this paper, a novel Hierarchical Hypothesis Testing framework with Request-and-Reverify strategy is developed to detect concept drifts by requesting labels only when necessary. Two methods, namely Hierarchical Hypothesis Testing with Classification Uncertainty (HHT-CU) and Hierarchical Hypothesis Testing with Attribute-wise "Goodness-of-fit" (HHT-AG), are proposed respectively under the novel framework. In experiments with benchmark datasets, our methods demonstrate overwhelming advantages over state-of-the-art unsupervised drift detectors. More importantly, our methods even outperform DDM (the widely used supervised drift detector) when we use significantly fewer labels.
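The abstract names the two-layer idea but not its test statistics, so the following is only a minimal sketch of a request-and-reverify loop under stated assumptions, not the authors' implementation. It assumes Layer I compares the mean classification uncertainty of a reference window and a recent window using a two-sample Hoeffding bound (no labels needed), and that Layer II, triggered only when Layer I fires, requests labels for the recent window and reverifies with a permutation test on 0/1 losses. The names `hoeffding_bound`, `permutation_pvalue`, `request_and_reverify`, and the `request_labels` callback are hypothetical, as are the default `delta` and `alpha` values.

```python
import math
import random
from statistics import mean


def hoeffding_bound(n_ref, n_new, delta=0.01):
    """Two-sample Hoeffding bound on the mean gap for values in [0, 1]."""
    m = (n_ref * n_new) / (n_ref + n_new)  # 1/m = 1/n_ref + 1/n_new
    return math.sqrt(math.log(2.0 / delta) / (2.0 * m))


def permutation_pvalue(losses_ref, losses_new, n_perm=1000, seed=0):
    """Permutation test on the absolute difference of mean 0/1 losses."""
    rng = random.Random(seed)
    observed = abs(mean(losses_new) - mean(losses_ref))
    pooled = list(losses_ref) + list(losses_new)
    k = len(losses_ref)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if abs(mean(pooled[k:]) - mean(pooled[:k])) >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


def request_and_reverify(unc_ref, unc_new, losses_ref, request_labels,
                         delta=0.01, alpha=0.05):
    """Return (drift_confirmed, number_of_labels_requested).

    unc_ref / unc_new: label-free uncertainty scores in [0, 1].
    losses_ref: 0/1 losses on the (already labeled) reference window.
    request_labels: hypothetical callback returning 0/1 losses for the
        recent window; it is the only step that costs labels.
    """
    # Layer I: label-free screening on classification uncertainty.
    gap = abs(mean(unc_new) - mean(unc_ref))
    if gap <= hoeffding_bound(len(unc_ref), len(unc_new), delta):
        return False, 0
    # Layer II: request labels only now, then reverify the suspicion.
    losses_new = request_labels()
    p_value = permutation_pvalue(losses_ref, losses_new)
    return p_value < alpha, len(losses_new)
```

In this sketch a stable stream never calls `request_labels`, so it consumes no labels; labels are spent only when Layer I raises a suspicion, and a Layer I false alarm is discarded if the error rate on the newly labeled window is statistically indistinguishable from the reference.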

Publisher

International Joint Conferences on Artificial Intelligence Organization

Cited by 9 articles.

1. Overview of Wind and Photovoltaic Data Stream Classification and Data Drift Issues;Energies;2024-09-01

2. slidSHAPs – sliding Shapley Values for correlation-based change detection in time series;2023 IEEE 10th International Conference on Data Science and Advanced Analytics (DSAA);2023-10-09

3. Resilient edge machine learning in smart city environments;Journal of Smart Cities and Society;2023-07-07

4. Noise tolerant drift detection method for data stream mining;Information Sciences;2022-09

5. STUDD: a student–teacher method for unsupervised concept drift detection;Machine Learning;2022-06-21