Author:
Azimi Shelernaz, Pahl Claus
Abstract
Machine learning is widely used to create prediction and classification models. The quality of a machine learning (ML) model depends not only on the model creation process but also on the quality of the input data. We investigate the impact of data quality on ML model quality in a generic way. The aim is to identify possible data quality problems from anomalies observed in the ML model over time, by means of a root cause analysis of the detected anomalies. We develop a generic anomaly detection and analysis framework and demonstrate its application to two prediction scenarios based on sensor data.
Funder
Libera Università di Bolzano
Publisher
Springer Science and Business Media LLC