Neuron Sensitivity Guided Test Case Selection

Author:

Huang Dong¹, Bu Qingwen², Fu Yichao¹, Qing Yuhao¹, Xie Xiaofei³, Chen Junjie⁴, Cui Heming¹

Affiliation:

1. The University of Hong Kong, China

2. Shanghai Jiao Tong University, China

3. Singapore Management University, Singapore

4. College of Intelligence and Computing, Tianjin University, China

Abstract

Deep Neural Networks (DNNs) have been widely deployed in software to address various tasks (e.g., autonomous driving, medical diagnosis). However, they can also produce incorrect behaviors that lead to financial losses and even threaten human safety. To reveal and repair incorrect behaviors in DNNs, developers often collect rich unlabeled data from the natural world and label it to test DNN models. However, properly labeling such large amounts of data is a highly expensive and time-consuming task. To address this problem, we propose NSS, Neuron Sensitivity guided test case Selection, which reduces labeling time by selecting valuable test cases from unlabeled datasets. NSS leverages the internal neuron information induced by the test cases to select those that are highly likely to cause the model to behave incorrectly. We evaluated NSS on four widely used datasets and four well-designed DNN models against state-of-the-art (SOTA) baseline methods. The results show that NSS performs well both in assessing the probability that a test case triggers a failure and in improving the model. Specifically, compared with the baseline approaches, NSS achieves a higher fault detection rate (e.g., when selecting 5% of the test cases from the unlabeled dataset in the MNIST&LeNet1 experiment, NSS obtains an 81.8% fault detection rate, a 20% increase over SOTA baseline strategies).
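The selection idea described above can be sketched as follows. This is a minimal, hypothetical illustration, not the paper's actual algorithm: it assumes "neuron sensitivity" can be approximated by how much an input's internal activations change under a small perturbation, uses a toy one-layer network (`W1`, `b1` are made-up weights standing in for a trained model), and simply ranks unlabeled inputs by that score to pick a labeling budget.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

# Toy hidden layer standing in for a trained DNN under test.
# In practice, activations would come from the real model's internals.
W1 = rng.normal(size=(8, 4))
b1 = rng.normal(size=8)

def hidden_activations(x):
    """Internal neuron activations for input x."""
    return relu(W1 @ x + b1)

def neuron_sensitivity(x, eps=1e-2):
    """Score an input by how much its internal neuron activations change
    under a small random perturbation (one plausible proxy for 'neuron
    sensitivity'; not necessarily the metric used by NSS)."""
    delta = eps * rng.normal(size=x.shape)
    return float(np.abs(hidden_activations(x + delta) - hidden_activations(x)).mean())

def select_test_cases(unlabeled, budget):
    """Rank unlabeled inputs by sensitivity and keep the top `budget`,
    i.e., the cases most likely worth spending labeling effort on."""
    scores = np.array([neuron_sensitivity(x) for x in unlabeled])
    order = np.argsort(scores)[::-1]  # most sensitive first
    return [unlabeled[i] for i in order[:budget]]

# Select 5% of a 100-input unlabeled pool for labeling.
unlabeled = [rng.normal(size=4) for _ in range(100)]
selected = select_test_cases(unlabeled, budget=5)
print(len(selected))  # 5
```

The key design point is that scoring uses only the model's internal behavior on unlabeled inputs, so no ground-truth labels are needed until after the budgeted subset has been chosen.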

Publisher

Association for Computing Machinery (ACM)

