Affiliation:
1. Research and Development Group, Hitachi Ltd., Tokyo, Japan
Abstract
This paper introduces a method to solve the 2D irregular packing problem using Deep Reinforcement Learning (Deep RL) for logistics. Our method employs a Q agent trained to predict the best placement within a container, maximizing available space. Unlike previous Deep RL algorithms, our method introduces a dense reward function at each packing step, providing immediate feedback and accelerating learning. To our knowledge, this is the first approach to use a dense reward to address the 2D irregular packing problem. Building on our earlier work, we improve the deep neural network by incorporating the Double Deep Q-Network (DDQN) framework to enhance our deep Q-learning approach, reducing overestimation biases and improving decision-making reliability. Simulation results show the method’s effectiveness in completing the online 2D irregular packing tasks, achieving promising volume efficiency and packed piece metrics. This research extends our initial findings, highlighting the practical importance of DDQN and dense reward in advancing 2D irregular packing problem-solving. These advancements not only broaden the applications of deep learning but also hold practical importance for real-world logistics challenges.
Publisher
World Scientific Pub Co Pte Ltd