Optimization of 2D Irregular Packing: Deep Reinforcement Learning with Dense Reward

Published: 2024-05-27
Page: 1-12
ISSN: 1793-351X
Container-title: International Journal of Semantic Computing
Language: en
Short-container-title: Int. J. Semantic Computing

Author:

Crescitelli Viviana¹, Oshima Takashi¹

Affiliation:

1. Research and Development Group, Hitachi Ltd., Tokyo, Japan

Abstract

This paper introduces a method to solve the 2D irregular packing problem using Deep Reinforcement Learning (Deep RL) for logistics. Our method employs a Q agent trained to predict the best placement within a container, maximizing available space. Unlike previous Deep RL algorithms, our method introduces a dense reward function at each packing step, providing immediate feedback and accelerating learning. To our knowledge, this is the first approach to use a dense reward to address the 2D irregular packing problem. Building on our earlier work, we improve the deep neural network by incorporating the Double Deep Q-Network (DDQN) framework to enhance our deep Q-learning approach, reducing overestimation biases and improving decision-making reliability. Simulation results show the method’s effectiveness in completing the online 2D irregular packing tasks, achieving promising volume efficiency and packed piece metrics. This research extends our initial findings, highlighting the practical importance of DDQN and dense reward in advancing 2D irregular packing problem-solving. These advancements not only broaden the applications of deep learning but also hold practical importance for real-world logistics challenges.
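The abstract credits the Double Deep Q-Network (DDQN) framework with reducing overestimation bias. As a rough illustration of that idea only, the sketch below computes the standard Double DQN bootstrap target: the online network selects the next action and the target network evaluates it. It is not the authors' implementation. The function name `double_dqn_targets`, its parameters, and the numbers in the usage example are all hypothetical, and the paper's actual dense reward is not reproduced here; with a dense reward, each entry of `rewards` would simply be the feedback given after one placement step.

```python
def double_dqn_targets(rewards, dones, q_next_online, q_next_target, gamma=0.99):
    """Standard Double DQN targets for a batch of transitions (illustrative).

    rewards        -- per-step rewards, one per transition
    dones          -- True where the episode ended on that transition
    q_next_online  -- online-network Q-values at the next state, one list per transition
    q_next_target  -- target-network Q-values at the next state, same shape
    gamma          -- discount factor (assumed value, not taken from the paper)
    """
    targets = []
    for r, done, q_on, q_tg in zip(rewards, dones, q_next_online, q_next_target):
        # Action selection uses the online network...
        best_action = max(range(len(q_on)), key=lambda a: q_on[a])
        # ...while action evaluation uses the target network.
        bootstrap = 0.0 if done else gamma * q_tg[best_action]
        targets.append(r + bootstrap)
    return targets


# Hypothetical two-transition batch with two candidate placements each.
example_targets = double_dqn_targets(
    rewards=[1.0, 0.5],
    dones=[False, True],
    q_next_online=[[1.0, 3.0], [2.0, 0.0]],
    q_next_target=[[10.0, 4.0], [7.0, 9.0]],
    gamma=0.5,
)
print(example_targets)  # [3.0, 0.5]
```

In the first transition, a plain DQN target would bootstrap from the target network's own maximum (10.0) and yield 6.0. The Double DQN target instead uses the value of the action preferred by the online network (4.0) and yields 3.0, which is how decoupling selection from evaluation dampens overestimation.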

Publisher

World Scientific Pub Co Pte Ltd

Link

https://www.worldscientific.com/doi/pdf/10.1142/S1793351X24430025