Abstract
Tactical UAV path planning under radar threat with reinforcement learning poses particular challenges, ranging from modeling difficulties to the sparse-feedback problem. Learning goal-directed behavior from sparse feedback in complex environments is a fundamental challenge for reinforcement learning algorithms. In this paper, we extend our previous work in this area and address the problem setting above using Hierarchical Reinforcement Learning (HRL) in a novel way: a meta-controller performs higher-level goal assignment, while a controller determines the agent's lower-level actions. The meta-controller is a regression model trained with a state-transition scheme that defines the evolution of goal designation, whereas the lower-level controller is a Deep Q-Network (DQN) trained through reinforcement learning iterations. This two-layer framework builds an optimal plan for a complex path, organized as multiple goals, gradually through piecewise assignment of sub-goals, yielding a staged, efficient, and rigorous procedure.