Author:
RAO Ning, XU Hua, QI Zisen, SONG Bailin, SHI Yunhao
Abstract
To optimize the allocation of interference resources in communication network countermeasures, an interference resource allocation method based on maximum policy entropy deep reinforcement learning (MPEDRL) is proposed. The method introduces deep reinforcement learning into resource allocation for communication countermeasures. By adding the maximum policy entropy criterion and adaptively adjusting the entropy coefficient, it enhances policy exploration and accelerates convergence to the global optimum. Interference resource allocation is modeled as a Markov decision process. An interference policy network is established to output the allocation scheme, and an interference effect evaluation network with a clipped twin structure is constructed to evaluate efficacy. The policy network and the evaluation network are trained with the goal of maximizing both the policy entropy and the cumulative interference efficacy, after which the optimal interference resource allocation scheme is determined. Simulation results show that the algorithm effectively solves the resource allocation problem in communication network confrontation. Compared with existing deep reinforcement learning methods, it learns faster and fluctuates less during training, and it achieves 15% higher jamming efficacy than the DDPG-based method.
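The abstract does not give the update equations, so the following is only a minimal sketch of the two ingredients it names, written under the assumption that they follow the standard soft actor-critic formulation: a clipped twin evaluation target (the minimum of two value estimates with an entropy bonus) and an adaptive entropy coefficient. The function names `soft_td_target` and `update_log_alpha`, the parameter names, and the default values are hypothetical and are not taken from the paper.

```python
import math


def soft_td_target(reward, done, q1_next, q2_next, logp_next, alpha, gamma=0.99):
    """Clipped twin soft target (assumed form, not the paper's stated equation).

    Takes the smaller of the two evaluation-network estimates to limit
    overestimation, then adds the entropy bonus -alpha * log pi(a'|s').
    """
    q_min = min(q1_next, q2_next)
    return reward + gamma * (1.0 - done) * (q_min - alpha * logp_next)


def update_log_alpha(log_alpha, logps, target_entropy, lr=1e-3):
    """One gradient step on J(alpha) = E[-alpha * (log pi + target_entropy)].

    The step is taken with respect to log_alpha so that alpha stays positive.
    The learning rate and target entropy are illustrative assumptions.
    """
    alpha = math.exp(log_alpha)
    mean_gap = sum(lp + target_entropy for lp in logps) / len(logps)
    grad = -alpha * mean_gap  # dJ / d(log_alpha) by the chain rule
    return log_alpha - lr * grad


if __name__ == "__main__":
    # Hypothetical numbers for a single transition.
    print(soft_td_target(1.0, 0.0, 2.0, 3.0, -1.0, 0.5, gamma=0.9))
    print(update_log_alpha(0.0, [-0.5], target_entropy=1.0, lr=0.1))
```

In this sketch the coefficient grows when the sampled policy entropy falls below the target, which encourages more exploration, and shrinks when the entropy is above the target.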
Cited by
7 articles.