Neuron Sensitivity Guided Test Case Selection

Author:

Huang Dong¹, Bu Qingwen², Fu Yichao¹, Qing Yuhao¹, Xie Xiaofei³, Chen Junjie⁴, Cui Heming¹

Affiliation:

1. The University of Hong Kong, China

2. Shanghai Jiao Tong University, China

3. Singapore Management University, Singapore

4. College of Intelligence and Computing, Tianjin University, China

Abstract

Deep Neural Networks (DNNs) have been widely deployed in software to address various tasks (e.g., autonomous driving, medical diagnosis). However, they can also produce incorrect behaviors that lead to financial losses and even threaten human safety. To reveal and repair incorrect behaviors in DNNs, developers often collect rich unlabeled data from the natural world and label it to test DNN models. However, properly labeling such large amounts of data is a highly expensive and time-consuming task. To address this problem, we propose NSS, Neuron Sensitivity guided test case Selection, which reduces labeling time by selecting valuable test cases from unlabeled datasets. NSS leverages the internal neuron information induced by the test cases to select those that are highly likely to cause the model to behave incorrectly. We evaluated NSS on four widely used datasets and four well-designed DNN models against state-of-the-art (SOTA) baseline methods. The results show that NSS performs well both in assessing the probability that a test case triggers a failure and in improving the model. Specifically, compared with the baseline approaches, NSS achieves a higher fault detection rate (e.g., when selecting 5% of the test cases from the unlabeled dataset in the MNIST&LeNet1 experiment, NSS obtains an 81.8% fault detection rate, a 20% increase over SOTA baseline strategies).
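The selection idea described above can be sketched as follows. This is a minimal, hypothetical illustration, not the paper's actual algorithm: it assumes "neuron sensitivity" can be approximated by how much an input's internal activations change under a small perturbation, uses a toy one-layer network (`W1`, `b1` are made-up weights standing in for a trained model), and simply ranks unlabeled inputs by that score to pick a labeling budget.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

# Toy hidden layer standing in for a trained DNN under test.
# In practice, activations would come from the real model's internals.
W1 = rng.normal(size=(8, 4))
b1 = rng.normal(size=8)

def hidden_activations(x):
    """Internal neuron activations for input x."""
    return relu(W1 @ x + b1)

def neuron_sensitivity(x, eps=1e-2):
    """Score an input by how much its internal neuron activations change
    under a small random perturbation (one plausible proxy for 'neuron
    sensitivity'; not necessarily the metric used by NSS)."""
    delta = eps * rng.normal(size=x.shape)
    return float(np.abs(hidden_activations(x + delta) - hidden_activations(x)).mean())

def select_test_cases(unlabeled, budget):
    """Rank unlabeled inputs by sensitivity and keep the top `budget`,
    i.e., the cases most likely worth spending labeling effort on."""
    scores = np.array([neuron_sensitivity(x) for x in unlabeled])
    order = np.argsort(scores)[::-1]  # most sensitive first
    return [unlabeled[i] for i in order[:budget]]

# Select 5% of a 100-input unlabeled pool for labeling.
unlabeled = [rng.normal(size=4) for _ in range(100)]
selected = select_test_cases(unlabeled, budget=5)
print(len(selected))  # 5
```

The key design point is that scoring uses only the model's internal behavior on unlabeled inputs, so no ground-truth labels are needed until after the budgeted subset has been chosen.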

Publisher

Association for Computing Machinery (ACM)

