Security Development Lifecycle-Based Adaptive Reward Mechanism for Reinforcement Learning in Continuous Integration Testing Optimization

Author:

Yang Yang¹, Wang Weiwei², Li Zheng³, Zhang Lieshan¹, Pan Chaoyue³

Affiliation:

1. School of Information Science and Engineering, Zhejiang Sci-Tech University, Zhejiang 310018, P. R. China

2. School of Information Engineering, Beijing Institute of Petrochemical Technology, Beijing 100095, P. R. China

3. College of Information Science and Technology, Beijing University of Chemical Technology, Beijing 100029, P. R. China

Abstract

Continuous automated testing throughout each cycle helps secure the continuous integration (CI) development lifecycle. Test case prioritization (TCP) is a critical factor in optimizing automated testing: it ranks potentially failing test cases first and thereby improves testing efficiency. In CI automated testing, TCP is a continuous decision-making process that can be solved with reinforcement learning (RL). RL-based CITCP continuously generates a TCP strategy for each CI development lifecycle, with the reward mechanism at its core. The reward mechanism consists of a reward function and a reward strategy. However, real-industry CI testing poses new challenges for RL-based CITCP. Under high-frequency iteration, the reward function is often computed over a fixed length of historical information, ignoring the characteristics of the current cycle. Therefore, a dynamic time window (DTW)-based reward function is proposed, which adaptively adjusts the range of recent historical information according to the integration cycle. Moreover, under low-failure testing, the reward strategy usually rewards only failed test cases, which creates a sparse reward problem in RL. To address this issue, a similarity-based reward strategy is proposed, which extends the reward to some passed test cases that are similar to the failed test cases. Together, the DTW-based reward function and the similarity-based reward strategy constitute the proposed adaptive reward mechanism for RL-based CITCP. To validate its effectiveness, experiments are conducted on 13 industrial data sets. The results show that the adaptive reward mechanism improves TCP performance: the average NAPFD improves by up to 7.29%, the average Recall improves by up to 6.04%, and the average TTF improves by 6.81 positions, with a maximum of 63.77.
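The two components of the adaptive reward mechanism can be illustrated with a minimal sketch. Everything below is an assumption for exposition only: the function names (`dynamic_window`, `dtw_reward`, `adaptive_reward`), the linear window-adaptation rule, the match-ratio similarity measure, and the 0.5 discount for similar passed test cases are not taken from the paper, which does not specify its exact formulas in the abstract.

```python
# Hypothetical sketch of a DTW-based reward function plus a
# similarity-based reward strategy for RL-based CITCP.
# All names, constants, and formulas here are illustrative assumptions.

def dynamic_window(cycle_interval, base_window=10, ref_interval=24.0):
    """Adapt the history window to the integration cycle: a shorter
    interval between integrations (higher-frequency CI) yields a
    shorter window of recent historical information."""
    scale = cycle_interval / ref_interval
    return max(1, round(base_window * scale))

def dtw_reward(history, cycle_interval):
    """Average failure signal (1 = failed, 0 = passed) over the last
    `w` executions, where `w` is chosen by the dynamic time window."""
    w = dynamic_window(cycle_interval)
    recent = history[-w:]
    return sum(recent) / len(recent) if recent else 0.0

def similarity(h1, h2):
    """Fraction of positions at which two 0/1 execution histories agree."""
    matches = sum(a == b for a, b in zip(h1, h2))
    return matches / max(len(h1), 1)

def adaptive_reward(test, failed_tests, cycle_interval, sim_threshold=0.7):
    """Failed test cases receive the DTW-based reward; passed test cases
    whose history is similar to a failed test case receive a discounted
    reward, which eases the sparse reward problem in low-failure CI."""
    r = dtw_reward(test["history"], cycle_interval)
    if test["failed"]:
        return r
    if any(similarity(test["history"], f["history"]) >= sim_threshold
           for f in failed_tests):
        return 0.5 * r  # discounted reward for similar passed tests
    return 0.0
```

Under this sketch a passed test case that failed in the same cycles as a currently failing one still contributes a non-zero reward, so the RL agent receives feedback even in cycles with very few failures.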

Publisher

World Scientific Pub Co Pte Ltd
