Abstract
This paper addresses the problem of optimal policy selection for the attacker and the sensor in cyber-physical systems (CPSs) under denial-of-service (DoS) attacks. Because the sensor and the attacker have opposing goals, their interaction is modeled as a two-player zero-sum game, and the Nash equilibrium strategies are studied to obtain the optimal actions of both players. To evaluate and quantify the payoffs effectively, a reinforcement learning algorithm is proposed that dynamically adjusts the corresponding strategies. Furthermore, secure state estimation is introduced to evaluate the impact of the offensive and defensive strategies on CPSs. Within the algorithm, the ε-greedy policy is improved so that optimal choices are made only after sufficient learning, which balances exploration and exploitation. Notably, a channel reliability factor is included so that CPSs with multiple causes of packet loss can be studied. The reinforcement learning algorithm is designed for two scenarios: a reliable channel (packet loss is caused only by DoS attacks) and an unreliable channel (packet loss is not caused entirely by DoS attacks). Simulation results for the two scenarios show that the proposed reinforcement learning algorithm converges quickly to the Nash equilibrium policies of both sides, demonstrating the applicability and effectiveness of the algorithm.
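The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the payoff matrix, the decay schedule `epsilon`, and the helper names `learn_joint_q` and `pure_equilibrium` are all hypothetical, and the game is reduced to a single-state matrix game with a pure-strategy saddle point. The sketch only shows how an ε-greedy rule whose exploration rate shrinks over time lets both players first explore and then settle on the equilibrium actions.

```python
import random

# Hypothetical sensor payoffs (the attacker receives the negative).
# Rows are sensor actions, columns are attacker actions. Illustrative values only.
PAYOFF = [
    [3.0, 1.0],
    [4.0, 2.0],
]

def epsilon(step, eps0=1.0, decay=0.01):
    """Exploration rate that shrinks as learning proceeds (assumed schedule)."""
    return eps0 / (1.0 + decay * step)

def learn_joint_q(payoff, steps=5000, alpha=0.1, seed=0):
    """Learn a joint-action value table with decaying epsilon-greedy play."""
    rng = random.Random(seed)
    n_s, n_a = len(payoff), len(payoff[0])
    q = [[0.0] * n_a for _ in range(n_s)]
    for t in range(steps):
        eps = epsilon(t)
        # Sensor: explore, or pick the row with the best worst-case value.
        if rng.random() < eps:
            s = rng.randrange(n_s)
        else:
            s = max(range(n_s), key=lambda i: min(q[i]))
        # Attacker: explore, or pick the column whose best case for the sensor is smallest.
        if rng.random() < eps:
            a = rng.randrange(n_a)
        else:
            a = min(range(n_a), key=lambda j: max(row[j] for row in q))
        # Move the estimate for the played action pair toward the observed payoff.
        q[s][a] += alpha * (payoff[s][a] - q[s][a])
    return q

def pure_equilibrium(q):
    """Maximin row for the sensor and minimax column for the attacker."""
    s = max(range(len(q)), key=lambda i: min(q[i]))
    a = min(range(len(q[0])), key=lambda j: max(row[j] for row in q))
    return s, a

if __name__ == "__main__":
    q_table = learn_joint_q(PAYOFF)
    print(pure_equilibrium(q_table))
```

In this toy matrix, the second sensor action and the second attacker action form the saddle point, so neither player gains by deviating alone. A game without a pure-strategy saddle point would need mixed strategies, which this sketch does not cover.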
Funder
National Natural Science Foundation of China
Fundamental Research Funds for the Central Universities of China
Ningbo Natural Science Foundation
Subject
Control and Optimization, Control and Systems Engineering
Cited by
13 articles.