Affiliation:
1. College of Mechanical and Energy Engineering, Beijing University of Technology, Beijing 100124, China
2. Capital Aerospace Machinery Co., Ltd., Beijing 100076, China
Abstract
It is very difficult for manufacturing enterprises to achieve automatic coordination of multiproject and multilevel planning when they are unable to make large-scale resource adjustments. In addition, planning and coordination work mostly relies on human experience, and inaccurate planning often occurs. This article innovatively proposes the PERT-RP-DDPGAO algorithm, which effectively combines the program evaluation and review technique (PERT) and deep deterministic policy gradient (DDPG) technology. Innovatively using matrix computing, the resource plan (RP) itself is used for the first time as an intelligent agent for reinforcement learning, achieving automatic coordination of multilevel plans. Through experiments, this algorithm can achieve automatic planning and has interpretability in management theory. To solve the problem of continuous control, the second half of the new algorithm adopts the DDPG algorithm, which has advantages in convergence and response speed compared to traditional reinforcement learning algorithms and heuristic algorithms. The response time of this algorithm is 3.0% lower than the traditional deep Q-network (DQN) algorithm and more than 8.4% shorter than the heuristic algorithm.
Funder
National Key Research and Development Program of China
Research Funds for Leading Talents Program
Research Project on the Development of Intelligent Manufacturing Strategy at the First Research Institute