Authors:
Geerkens Simon, Sieberichs Christian, Braun Alexander, Waschulzik Thomas
Abstract
The importance of high data quality is increasing with the growing impact and distribution of ML systems and big data. In addition, the planned AI Act from the European Commission defines challenging legal requirements for data quality, especially for the market introduction of safety-relevant ML systems. In this paper, we introduce a novel approach that supports the data quality assurance process for multiple data quality aspects. This approach enables the verification of quantitative data quality requirements. The concept and benefits are introduced and explained on small example data sets. The application of the method is demonstrated on the well-known MNIST data set of handwritten digits.
Funder
Hochschule Düsseldorf University of Applied Sciences
Publisher
Springer Science and Business Media LLC
Cited by
1 article.