Research on Intelligent Control Method of Launch Vehicle Landing Based on Deep Reinforcement Learning

Author:

Xue Shuai 1, Bai Hongyang 1, Zhao Daxiang 2, Zhou Junyan 2

Affiliation:

1. School of Energy and Power Engineering, Nanjing University of Science and Technology, Nanjing 210094, China

2. School of Automation, Nanjing University of Science and Technology, Nanjing 210094, China

Abstract

A launch vehicle must adapt to a complex environment during flight, and traditional guidance and control algorithms struggle with multi-factor uncertainties because they depend heavily on control models. To address this problem, this paper designs a new intelligent flight control method for a rocket based on a deep reinforcement learning algorithm driven by both knowledge and data. The Markov decision process of the rocket landing phase is established by designing a reward function that combines the return from the terminal constraints of the launch vehicle with the cumulative return over the flight process. To speed up training of the landing process and to improve the generalization ability of the model, the landing guidance policy network is built as a long short-term memory (LSTM) network combined with a fully connected layer. Proximal policy optimization (PPO) is used to train the network parameters, with behavioral cloning (BC) serving as the imitation learning algorithm for pre-training. The onboard environment is ported to the Nvidia Jetson TX2 embedded platform for comparative testing and verification of the trained model, which is then used to generate real-time control commands that guide the flight and landing process of the rocket. The proposed method is compared with convex-optimization-based landing guidance to demonstrate its effectiveness. The simulation results show that the intelligent control method meets the landing accuracy requirements of the launch vehicle, converges quickly (84 steps), and has a decision time of only 2.5 ms. Deployed on the embedded platform, it is also capable of online autonomous decision making.
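The abstract names two standard ingredients that a short sketch can make concrete: a return that combines per-step process rewards with a terminal-constraint term, and the clipped surrogate objective of PPO. The Python below is a minimal illustration only, not the authors' implementation. The function names, the linear terminal penalty, the weights `w_pos` and `w_vel`, the discount `gamma`, and the clip range `eps` are all assumptions introduced here; the abstract does not give the actual reward terms or hyperparameters.

```python
def episode_return(step_rewards, terminal_pos_err, terminal_vel_err,
                   gamma=0.99, w_pos=1.0, w_vel=1.0):
    """Discounted process return plus a terminal-constraint penalty.

    Hypothetical form: the terminal term penalizes landing position and
    velocity errors and is granted at the final step (index T - 1).
    """
    # Cumulative (discounted) return of the flight process.
    process = sum((gamma ** t) * r for t, r in enumerate(step_rewards))
    # Terminal-constraint return: larger errors give a more negative reward.
    terminal = -(w_pos * terminal_pos_err + w_vel * terminal_vel_err)
    last = max(len(step_rewards) - 1, 0)
    return process + (gamma ** last) * terminal


def ppo_clip_objective(ratios, advantages, eps=0.2):
    """Mean PPO clipped surrogate objective (a quantity to maximize).

    ratios[i] is pi_new(a_i | s_i) / pi_old(a_i | s_i); advantages[i] is
    the advantage estimate for the same sample.
    """
    total = 0.0
    for r, a in zip(ratios, advantages):
        clipped = max(1.0 - eps, min(1.0 + eps, r))
        # Take the pessimistic (smaller) of the raw and clipped terms.
        total += min(r * a, clipped * a)
    return total / len(ratios)


if __name__ == "__main__":
    # Two-step episode with made-up rewards and terminal errors.
    print(episode_return([1.0, 1.0], terminal_pos_err=2.0,
                         terminal_vel_err=1.0, gamma=0.5))
    # Two samples: one ratio above the clip range, one below it.
    print(ppo_clip_objective([1.5, 0.5], [1.0, -1.0], eps=0.2))
```

In this sketch the clipping limits how far a single update can move the policy away from the one that collected the data, which is the property that makes PPO a reasonable fine-tuning stage after a behavioral cloning pre-training step.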

Funder

National Natural Science Foundation of China

Publisher

MDPI AG

Subject

General Mathematics, Engineering (miscellaneous), Computer Science (miscellaneous)

