Abstract
This article proposes a hybrid algorithm that combines reinforcement learning with the inventory management methodology DDMRP (Demand Driven Material Requirement Planning) to determine the optimal time to buy a given product and the quantity to order. The inventory management problem is formulated as a Markov Decision Process in which the environment the system interacts with is designed from DDMRP concepts, and a reinforcement learning algorithm, specifically Q-Learning, determines the optimal policy for deciding when and how much to buy. Three approaches are proposed for the reward function: the first is based on inventory levels; the second is an optimization function based on the distance of the inventory from its optimal level; and the third is a shaping function based on both levels and distances to the optimal inventory. The results show that the proposed algorithm is promising across scenarios with different characteristics and performs adequately in difficult case studies covering discontinuous or continuous demand, seasonal and non-seasonal behavior, and high demand peaks, among others.
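To make the formulation above concrete, here is a minimal sketch of tabular Q-Learning on a toy inventory problem with a distance-based reward, loosely in the spirit of the second reward approach. It is not the authors' implementation: the environment, the names `distance_reward`, `q_update`, and `train`, the order sizes, the uniform demand model, and all hyperparameters are assumptions chosen for illustration, and no DDMRP buffer zones are modeled.

```python
import random


def distance_reward(inventory, target):
    """Hypothetical reward: negative distance to an assumed optimal level."""
    return -abs(inventory - target)


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.95):
    """One tabular Q-Learning step:
    Q(s, a) += alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))."""
    best_next = max(q[next_state])
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])


def train(episodes=200, horizon=30, capacity=20, target=10,
          order_sizes=(0, 5, 10), epsilon=0.1, seed=0):
    """Learn a Q-table indexed by [inventory level][order-size index].

    All parameters are illustrative assumptions, not values from the article.
    """
    rng = random.Random(seed)
    q = [[0.0] * len(order_sizes) for _ in range(capacity + 1)]
    for _ in range(episodes):
        inventory = target
        for _ in range(horizon):
            # Epsilon-greedy choice of how much to order.
            if rng.random() < epsilon:
                action = rng.randrange(len(order_sizes))
            else:
                action = max(range(len(order_sizes)),
                             key=lambda a: q[inventory][a])
            # Assumed demand model: uniform integer demand per period.
            demand = rng.randint(0, 6)
            next_inventory = min(
                capacity, max(0, inventory + order_sizes[action] - demand)
            )
            reward = distance_reward(next_inventory, target)
            q_update(q, inventory, action, reward, next_inventory)
            inventory = next_inventory
    return q
```

After training, the greedy action at each inventory level (the index of the largest entry in `q[level]`) gives a "when and how much to buy" rule for this toy setting. A level-based or shaped reward could be swapped in by replacing `distance_reward`.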
Publisher
Springer Science and Business Media LLC
Subject
Artificial Intelligence, Industrial and Manufacturing Engineering, Software
Cited by
16 articles.