Testing machine learning based systems: a systematic mapping-Reference-Cited by-同舟云学术

Testing machine learning based systems: a systematic mapping

Published:2020-09-15 Issue:6 Volume:25 Page:5193-5254
ISSN:1382-3256
Container-title:Empirical Software Engineering
language:en
Short-container-title:Empir Software Eng

Author:

Riccio Vincenzo^ORCID,Jahangirova Gunel,Stocco Andrea,Humbatova Nargiz,Weiss Michael,Tonella Paolo

Abstract

Abstract Context: A Machine Learning based System (MLS) is a software system including one or more components that learn how to perform a task from a given data set. The increasing adoption of MLSs in safety critical domains such as autonomous driving, healthcare, and finance has fostered much attention towards the quality assurance of such systems. Despite the advances in software testing, MLSs bring novel and unprecedented challenges, since their behaviour is defined jointly by the code that implements them and the data used for training them. Objective: To identify the existing solutions for functional testing of MLSs, and classify them from three different perspectives: (1) the context of the problem they address, (2) their features, and (3) their empirical evaluation. To report demographic information about the ongoing research. To identify open challenges for future research. Method: We conducted a systematic mapping study about testing techniques for MLSs driven by 33 research questions. We followed existing guidelines when defining our research protocol so as to increase the repeatability and reliability of our results. Results: We identified 70 relevant primary studies, mostly published in the last years. We identified 11 problems addressed in the literature. We investigated multiple aspects of the testing approaches, such as the used/proposed adequacy criteria, the algorithms for test input generation, and the test oracles. Conclusions: The most active research areas in MLS testing address automated scenario/input generation and test oracle creation. MLS testing is a rapidly growing and developing research area, with many open challenges, such as the generation of realistic inputs and the definition of reliable evaluation metrics and benchmarks.

Funder

ERC-2017-ADG

Publisher

Springer Science and Business Media LLC

Subject

Software

Link

https://link.springer.com/content/pdf/10.1007/s10664-020-09881-0.pdf

Reference133 articles.

1. Abdessalem RB, Nejati S, Briand LC, Stifter T (2016) Testing advanced driver assistance systems using multi-objective search and neural networks. In: Proceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering. ASE, pp 63–74

2. Abdessalem RB, Nejati S, Briand LC, Stifter T (2018a) Testing vision-based control systems using learnable evolutionary algorithms. In: Proceedings of the 40th International Conference on Software Engineering, ICSE ’18. ACM, New York. https://doi.org/10.1145/3180155.3180160, pp 1016–1026

3. Abdessalem RB, Panichella A, Nejati S, Briand LC, Stifter T (2018b) Testing autonomous cars for feature interaction failures using many-objective search. In: Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering, ASE 2018. ACM, New York, pp 143–154. https://doi.org/10.1145/3238147.3238192