Affiliations:
1. University of Ottawa, Ottawa, Canada
2. École de technologie supérieure, Montreal, Canada
3. Lero SFI Centre on Software Research and University of Limerick, Limerick, Ireland
Abstract
Deep neural networks (DNNs) are widely used in various application domains such as image processing, speech recognition, and natural language processing. However, testing DNN models may be challenging due to the complexity and size of their input domain. In particular, testing DNN models often requires generating or exploring large unlabeled datasets. In practice, DNN test oracles, which identify the correct outputs for given inputs, often require expensive manual effort to label test data, possibly involving multiple experts to ensure labeling correctness. In this article, we propose DeepGD, a black-box multi-objective test selection approach for DNN models. It reduces the cost of labeling by prioritizing the selection of test inputs with high fault-revealing power from large unlabeled datasets. DeepGD not only selects test inputs with high uncertainty scores to trigger as many mispredicted inputs as possible, but also maximizes the probability of revealing distinct faults in the DNN model by selecting diverse mispredicted inputs. Experiments on four widely used datasets and five DNN models show that, in terms of fault-revealing ability, (1) white-box, coverage-based approaches fare poorly, (2) DeepGD outperforms existing black-box test selection approaches in fault detection, and (3) DeepGD also provides better guidance for DNN model retraining when the selected inputs are used to augment the training set.
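To make the two objectives concrete, below is a minimal sketch of uncertainty-plus-diversity test selection. It is not DeepGD's actual algorithm: the abstract only states that DeepGD combines uncertainty scores with input diversity, so this sketch assumes Gini impurity of the model's softmax outputs as the uncertainty proxy, nearest-neighbor distance in some feature embedding (`features`) as the diversity proxy, and a simple greedy weighted-sum loop (`alpha` trades off the two) standing in for the paper's multi-objective search.

```python
# Illustrative sketch only: greedy selection balancing uncertainty and
# diversity. Assumptions (not from the abstract): Gini impurity as the
# uncertainty score, nearest-neighbor feature distance as diversity.
import numpy as np

def gini_uncertainty(probs: np.ndarray) -> np.ndarray:
    """Gini impurity of each softmax vector; higher means less confident."""
    return 1.0 - np.sum(probs ** 2, axis=1)

def select_tests(features: np.ndarray, probs: np.ndarray,
                 budget: int, alpha: float = 0.5) -> list[int]:
    """Greedily pick `budget` inputs scoring high on a weighted sum of
    uncertainty and distance to the already-selected set (diversity)."""
    unc = gini_uncertainty(probs)
    selected = [int(np.argmax(unc))]  # seed with the most uncertain input
    for _ in range(budget - 1):
        # Distance of every candidate to its nearest already-selected input.
        d = np.min(np.linalg.norm(features[:, None, :] -
                                  features[None, selected, :], axis=2), axis=1)
        score = alpha * unc + (1 - alpha) * d / (d.max() + 1e-12)
        score[selected] = -np.inf  # never re-pick a selected input
        selected.append(int(np.argmax(score)))
    return selected
```

A weighted sum collapses the two objectives into one score for simplicity; a genuinely multi-objective approach would instead search for Pareto-optimal subsets rather than fixing a single trade-off weight.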
Funder
Natural Sciences and Engineering Research Council of Canada
Publisher
Association for Computing Machinery (ACM)
Cited by
3 articles.