Orbital Interception Pursuit Strategy for Random Evasion Using Deep Reinforcement Learning

Author:

Jiang Rui 1, Ye Dong 1, Xiao Yan 1, Sun Zhaowei 1, Zhang Zeming 1,2

Affiliation:

1. Research Center of Satellite Technology, School of Astronautics, Harbin Institute of Technology, Harbin, China.

2. Space Missions Engineering Lab, Department of Aerospace Science and Technology, Politecnico di Milano, Milano, Italy.

Abstract

This paper addresses the interception of a noncooperative evader spacecraft that adopts a random maneuver strategy in a one-to-one orbital pursuit–evasion problem, and proposes an interception strategy for the pursuer, with a decision-making training mechanism, based on deep reinforcement learning. Its core purpose is to improve the interception success rate in a highly uncertain environment. First, a multi-impulse orbit transfer model of the pursuer and evader is established, and a modular deep reinforcement learning training method is built. Second, an effective reward mechanism is proposed to train the pursuer to choose the impulse direction and impulse interval of the orbit transfer and to learn a successful interception strategy that is optimal in fuel and time. Finally, with the evader making a random maneuver decision in each training episode, the trained decision-making strategy is applied to the pursuer, and the resulting interception success rate is analyzed. The results show that the trained pursuer obtains a general and adaptable interception strategy. In each round of pursuit–evasion, against the evader's random maneuver strategy, the pursuer can adopt similar optimal decisions to cope with a high-dimensional environment and a thoroughly random state space while maintaining a high interception success rate.
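The abstract describes two ingredients that are easy to picture in code: an evader that picks a random maneuver, and a reward that trades off fuel and time while rewarding a successful interception. The following is a minimal sketch of that idea only. Every name, weight, and threshold in it (`CAPTURE_RADIUS_KM`, `W_FUEL`, `W_TIME`, `SUCCESS_BONUS`, `evader_impulse`, `step_reward`) is a hypothetical assumption for illustration and is not taken from the paper, whose actual reward design and dynamics model are not given in the abstract.

```python
import math
import random

# All constants below are illustrative assumptions, not values from the paper.
CAPTURE_RADIUS_KM = 1.0   # hypothetical distance that counts as interception
W_FUEL = 0.1              # hypothetical penalty per unit of delta-v magnitude
W_TIME = 0.001            # hypothetical penalty per second of impulse interval
SUCCESS_BONUS = 100.0     # hypothetical terminal reward for interception


def random_unit_vector(rng):
    """Sample a direction uniformly on the unit sphere."""
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    r = math.sqrt(1.0 - z * z)
    return (r * math.cos(phi), r * math.sin(phi), z)


def evader_impulse(rng, max_dv):
    """Random evader maneuver: random direction, magnitude in [0, max_dv]."""
    direction = random_unit_vector(rng)
    magnitude = rng.uniform(0.0, max_dv)
    return tuple(magnitude * c for c in direction)


def step_reward(prev_dist, dist, dv_mag, interval):
    """Reward for one pursuer decision (impulse magnitude and interval).

    Rewards closing distance, penalizes fuel and elapsed time, and adds
    a bonus when the pursuer is within the capture radius.
    """
    reward = prev_dist - dist        # shaping term: distance closed this step
    reward -= W_FUEL * dv_mag        # fuel cost
    reward -= W_TIME * interval      # time cost
    done = dist <= CAPTURE_RADIUS_KM
    if done:
        reward += SUCCESS_BONUS
    return reward, done
```

In a training loop, `evader_impulse` would be called with a seeded `random.Random` instance at each evader decision, and `step_reward` would score each pursuer action; the relative sizes of the weights decide how strongly the learned policy favors fuel savings over interception time.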

Publisher

American Association for the Advancement of Science (AAAS)

Subject

General Medicine


Cited by 6 articles.
