Can Deep Reinforcement Learning Improve Inventory Management? Performance on Lost Sales, Dual-Sourcing, and Multi-Echelon Problems

Published: 2022-05 Issue: 3 Volume: 24 Page: 1349-1368
ISSN: 1523-4614
Container-title: Manufacturing & Service Operations Management
Language: en
Short-container-title: M&SOM

Author:

Gijsbrechts Joren¹, Boute Robert N.²³, Van Mieghem Jan A.⁴, Zhang Dennis J.⁵

Affiliation:

1. Universidade Católica Portuguesa, Católica Lisbon School of Business and Economics, 1649-023 Lisbon, Portugal;

2. Vlerick Business School, Technology and Operations Management Area, 3000 Leuven, Belgium;

3. Katholieke Universiteit Leuven, Research Center for Operations Management, 3000 Leuven, Belgium;

4. Kellogg School of Management, Northwestern University, Evanston, Illinois 60208;

5. Olin Business School, Washington University in St. Louis, St. Louis, Missouri 63130

Abstract

Problem definition: Is deep reinforcement learning (DRL) effective at solving inventory problems? Academic/practical relevance: Given that DRL has successfully been applied in computer games and robotics, supply chain researchers and companies are interested in its potential in inventory management. We provide a rigorous performance evaluation of DRL in three classic and intractable inventory problems: lost sales, dual sourcing, and multi-echelon inventory management. Methodology: We model each inventory problem as a Markov decision process and apply and tune the Asynchronous Advantage Actor-Critic (A3C) DRL algorithm for a variety of parameter settings. Results: We demonstrate that the A3C algorithm can match the performance of the state-of-the-art heuristics and other approximate dynamic programming methods. Although the initial tuning was computationally and time demanding, only small changes to the tuning parameters were needed for the other studied problems. Managerial implications: Our study provides evidence that DRL can effectively solve stationary inventory problems. This is especially promising when problem-dependent heuristics are lacking. Yet, generating structural policy insight or designing specialized policies that are (ideally provably) near optimal remains desirable.
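As a rough illustration of what modeling an inventory problem as a Markov decision process can look like, the sketch below writes out one transition of a lost-sales system with a positive lead time and evaluates a simple base-stock heuristic by simulation. It is not the paper's implementation and does not include A3C: the state layout, the cost parameters (`holding_cost`, `lost_sale_penalty`), the uniform demand on 0 to 8, and the function names are all assumptions chosen for illustration.

```python
import random


def step(state, order, demand, holding_cost=1.0, lost_sale_penalty=4.0):
    """One period of a lost-sales inventory MDP (illustrative assumptions).

    state = (on_hand, q_1, ..., q_{L-1}): inventory on hand followed by
    outstanding orders, oldest first. The new order arrives L periods later,
    where L = len(state).
    """
    on_hand = state[0]
    pipeline = list(state[1:]) + [order]       # new order joins the back of the pipeline
    leftover = max(on_hand - demand, 0)        # unsold stock carried over
    lost = max(demand - on_hand, 0)            # unmet demand is lost, not backordered
    cost = holding_cost * leftover + lost_sale_penalty * lost
    next_state = (leftover + pipeline[0],) + tuple(pipeline[1:])
    return next_state, cost


def base_stock_order(state, base_stock_level):
    """Order up to a target inventory position (on hand plus pipeline)."""
    return max(base_stock_level - sum(state), 0)


def average_cost(base_stock_level, lead_time=2, periods=10_000, seed=0):
    """Estimate the long-run average cost of a base-stock policy by simulation."""
    rng = random.Random(seed)
    state = (base_stock_level,) + (0,) * (lead_time - 1)
    total = 0.0
    for _ in range(periods):
        order = base_stock_order(state, base_stock_level)
        demand = rng.randint(0, 8)             # assumed demand distribution
        state, cost = step(state, order, demand)
        total += cost
    return total / periods


if __name__ == "__main__":
    for level in (8, 10, 12, 14):
        print(level, round(average_cost(level), 3))
```

A learned policy would replace `base_stock_order` with a function mapping the full state vector to an order quantity; the heuristic here only uses the sum of the state, which is one reason such rules can be suboptimal under lost sales.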

Publisher

Institute for Operations Research and the Management Sciences (INFORMS)

Subject

Management Science and Operations Research, Strategy and Management

Link

https://pubsonline.informs.org/doi/pdf/10.1287/msom.2021.1064

Reference: 38 articles.

1. The theory of dynamic programming

2. Hyperopt: a Python library for model selection and hyperparameter optimization

3. Information Relaxations, Duality, and Convex Stochastic Dynamic Programs

Cited by 60 articles.

1. Scalable policies for the dynamic traveling multi-maintainer problem with alerts; European Journal of Operational Research; 2024-11

2. Contextual reinforcement learning for supply chain management; Expert Systems with Applications; 2024-09

3. An analysis of multi-agent reinforcement learning for decentralized inventory control systems; Computers & Chemical Engineering; 2024-09

4. Managing flexibility in supply chains: mathematical analysis of dual sourcing systems; IMA Journal of Management Mathematics; 2024-08-21

5. Benefits, challenges, and limitations of inventory control using machine learning algorithms: literature review; OPSEARCH; 2024-08-15