Research on Intelligent Control Method of Launch Vehicle Landing Based on Deep Reinforcement Learning

Published:2023-10-13 Issue:20 Volume:11 Page:4276
ISSN:2227-7390
Container-title:Mathematics
language:en
Short-container-title:Mathematics

Author:

Xue Shuai¹, Bai Hongyang¹, Zhao Daxiang², Zhou Junyan²

Affiliation:

1. School of Energy and Power Engineering, Nanjing University of Science and Technology, Nanjing 210094, China

2. School of Automation, Nanjing University of Science and Technology, Nanjing 210094, China

Abstract

A launch vehicle must adapt to a complex environment during flight, and traditional guidance and control algorithms struggle with multi-factor uncertainties because they depend heavily on control models. To address this problem, this paper designs a new intelligent flight control method for a rocket based on a deep reinforcement learning algorithm driven by both knowledge and data. The Markov decision process of the rocket landing phase is established by designing a reward function that combines the return from the terminal constraints of the launch vehicle with the cumulative return over the flight process. To speed up training of the landing process and improve the generalization ability of the model, the policy network for landing guidance is built as a long short-term memory (LSTM) network combined with a fully connected layer. Proximal policy optimization (PPO) is used to train the network parameters, with behavioral cloning (BC) serving as the imitation learning algorithm for pre-training. The rocket-borne environment is ported to the Nvidia Jetson TX2 embedded platform for comparative testing and verification of the intelligent model, which is then used to generate real-time control commands that guide the actual flight and landing process of the rocket. The results of convex landing optimization are compared with those of the proposed method to demonstrate its effectiveness. The simulation results show that the proposed intelligent control method meets the landing accuracy requirements of the launch vehicle, converges quickly in 84 steps, and has a decision time of only 2.5 ms. Deployed on the embedded platform, it is also capable of online autonomous decision making.
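
The abstract names proximal policy optimization (PPO) as the training algorithm. As a rough illustration only, and not the authors' implementation, the standard PPO clipped surrogate objective can be sketched in plain Python. The function names `ppo_clip_objective` and `mean_objective` and the default `clip_eps=0.2` are assumptions made for this sketch; the abstract does not state the clipping range used in the paper.

```python
def ppo_clip_objective(ratio, advantage, clip_eps=0.2):
    """Standard PPO clipped surrogate for one sample (to be maximized).

    ratio     : pi_new(a|s) / pi_old(a|s), the policy probability ratio
    advantage : estimated advantage of the sampled action
    clip_eps  : clipping range (0.2 is an assumed, commonly used value)
    """
    # Restrict the ratio to the interval [1 - eps, 1 + eps].
    clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
    # Take the pessimistic (smaller) of the unclipped and clipped terms.
    return min(ratio * advantage, clipped_ratio * advantage)


def mean_objective(ratios, advantages, clip_eps=0.2):
    """Average the clipped surrogate over a batch of samples."""
    values = [
        ppo_clip_objective(r, a, clip_eps)
        for r, a in zip(ratios, advantages)
    ]
    return sum(values) / len(values)
```

With a positive advantage, the objective stops rewarding ratios above `1 + clip_eps`; with a negative advantage, it stops rewarding ratios below `1 - clip_eps`. This limits how far a single update can move the policy away from the one that collected the data.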

Funder

National Natural Science Foundation of China

Publisher

MDPI AG

Subject

General Mathematics, Engineering (miscellaneous), Computer Science (miscellaneous)

Link

https://www.mdpi.com/2227-7390/11/20/4276/pdf

Reference: 28 articles.

1. Analysis and reflection on the development history of manned launch vehicles at Home and abroad;Wu;Manned Spacefl.,2019

2. Optimal staging of reusable launch vehicles for minimum life cycle cost;Jo;Aerosp. Sci. Technol.,2022

3. Jones, H.W. (2018, January 8–12). The recent large reduction in space launch cost. Proceedings of the 48th International Conference on Environmental Systems, Albuquerque, NM, USA.

4. Terminal Phase Descent Trajectory Optimization of Reusable Launch Vehicle;Mukundan;IFAC-PapersOnLine,2022

5. Development of flight control technology for Long March launch vehicle;Song;J. Astronaut.,2020

Cited by 1 article.

1. Towards an extensible model-based digital twin framework for space launch vehicles;Journal of Industrial Information Integration;2024-09