Affiliation:
1. Harbin Institute of Technology, 150001 Harbin, People’s Republic of China
Abstract
This paper presents a reinforcement learning (RL) based approach for autonomous maneuver planning of low-altitude flybys for site-specific reconnaissance of small bodies. Combining Monte Carlo tree search with deep neural networks, the proposed method generates optimal maneuvers, even under complex dynamics and abstract science goals. Formulating the mission objective as an observability function allows the RL problem to be framed as a Markov decision process. The neural network, trained by a novel policy gradient algorithm with a clipped surrogate objective, learns both policy and value functions that map the action and state spaces to the expected long-term return. An adaptive refinement search technique is applied to further enhance the trained policy network, finding optimal maneuvers from the policy distributions. Experimental results on a simulated reconnaissance mission around asteroid Itokawa illustrate the efficiency and robustness of the proposed approach in achieving multitarget observation.
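The abstract does not give the form of the clipped surrogate objective. As a minimal sketch, assuming it resembles the widely used proximal policy optimization (PPO) clipping, the per-batch objective could be computed as below. The function name `clipped_surrogate`, its arguments, and the default clipping range `eps=0.2` are illustrative assumptions, not details taken from the paper.

```python
import math


def clipped_surrogate(logp_new, logp_old, advantages, eps=0.2):
    """Mean PPO-style clipped surrogate objective (to be maximized).

    logp_new / logp_old: log-probabilities of the taken actions under the
    current and the data-collecting policy (hypothetical inputs).
    advantages: advantage estimates for the same samples.
    eps: clipping range (assumed value, not from the paper).
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(logp_new, logp_old, advantages):
        # Probability ratio between the new and old policies.
        ratio = math.exp(lp_new - lp_old)
        # Restrict the ratio to [1 - eps, 1 + eps].
        clipped = min(max(ratio, 1.0 - eps), 1.0 + eps)
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

Clipping removes the incentive to move the policy far from the one that collected the data in a single update, which is the usual motivation for this kind of objective.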
Funder
National Natural Science Foundation of China
Publisher
American Institute of Aeronautics and Astronautics (AIAA)