A Reinforcement Learning Approach for Scheduling Problems with Improved Generalization through Order Swapping

Published: 2023-04-29 Issue: 2 Volume: 5 Page: 418-430
ISSN: 2504-4990
Container-title: Machine Learning and Knowledge Extraction
Language: en
Short-container-title: MAKE

Author:

Vivekanandan Deepak¹, Wirth Samuel², Karlbauer Patrick², Klarmann Noah²

Affiliation:

1. ScaliRo GmbH, Eduard-Rüber-Straße 7, 83022 Rosenheim, Germany

2. Faculty of Management and Engineering, Rosenheim Technical University of Applied Sciences, Hochschulstraße 1, 83024 Rosenheim, Germany

Abstract

The scheduling of production resources (such as assigning jobs to machines) plays a vital role in the manufacturing industry, not only for saving energy but also for increasing overall efficiency. Among the different job scheduling problems, this work addresses the Job Shop Scheduling Problem (JSSP). The JSSP falls into the category of NP-hard Combinatorial Optimization Problems (COPs), for which solving the problem through exhaustive search becomes infeasible. Simple heuristics such as First-In, First-Out and Largest Processing Time First, and metaheuristics such as tabu search, are often adopted to solve the problem by truncating the search space. These methods become impractical for large problem sizes, as their solutions are either far from the optimum or time-consuming to compute. In recent years, research on using Deep Reinforcement Learning (DRL) to solve COPs has gained interest and has shown promising results in terms of solution quality and computational efficiency. In this work, we present a novel DRL-based approach to solving the JSSP that targets two objectives: generalization and solution effectiveness. In particular, we employ the Proximal Policy Optimization (PPO) algorithm, a policy-gradient method that has been found to perform well in the constrained dispatching of jobs. We incorporate a new method called the Order Swapping Mechanism (OSM) into the environment to achieve better generalized learning of the problem. The performance of the presented approach is analyzed in depth using a set of available benchmark instances and by comparing our results with the work of other groups.

Funder

the Federal Ministry for Economic Affairs and Climate Action

Publisher

MDPI AG

Subject

General Economics, Econometrics and Finance

Link

https://www.mdpi.com/2504-4990/5/2/25/pdf

References: 39 articles.

1. Pinedo, M.L. (2012). Scheduling, Springer.

2. Zhang (2020). Learning to dispatch for job shop scheduling via deep reinforcement learning. Adv. Neural Inf. Process. Syst.

3. Sutton, R.S., and Barto, A.G. (2018). Reinforcement Learning: An Introduction, MIT Press.

4. Silver, D., Hubert, T., Schrittwieser, J., Antonoglou, I., Lai, M., Guez, A., Lanctot, M., Sifre, L., Kumaran, D., and Graepel, T. (2017). Mastering chess and shogi by self-play with a general reinforcement learning algorithm. arXiv.

5. Vinyals (2019). Grandmaster level in StarCraft II using multi-agent reinforcement learning. Nature.

Cited by 3 articles.

1. Advanced Computational Methods for Modeling, Prediction and Optimization—A Review;Materials;2024-07-16

2. Reinforcement Learning for Reducing the Interruptions and Increasing Fault Tolerance in the Cloud Environment;Informatics;2023-08-02

3. Reinforcement Learning Approach for Optimizing Cloud Resource Utilization With Load Balancing;IEEE Access;2023