Dynamic Pricing and Inventory Control with Fixed Ordering Cost and Incomplete Demand Information

Published: 2021-12-14
ISSN: 0025-1909
Container-title: Management Science
Language: en

Author:

Chen Boxiao¹, Simchi-Levi David², Wang Yining³, Zhou Yuan⁴⁵

Affiliation:

1. College of Business Administration, University of Illinois, Chicago, Illinois 60607;

2. Institute for Data, Systems and Society, Operations Research Center, Department of Civil and Environmental Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139;

3. Warrington College of Business, University of Florida, Gainesville, Florida 32611;

4. Department of Industrial & Enterprise Systems Engineering, University of Illinois, Urbana-Champaign, Illinois 61801;

5. Yanqi Lake Beijing Institute of Mathematical Science and Applications, Beijing 101408, China

Abstract

We consider the periodic review dynamic pricing and inventory control problem with fixed ordering cost. Demand is random and price dependent, and unsatisfied demand is backlogged. With complete demand information, the celebrated $(s, S, p)$ policy is proved to be optimal, where $s$ and $S$ are the reorder point and order-up-to level for the ordering strategy, and $p$, a function of the on-hand inventory level, characterizes the pricing strategy. In this paper, we consider incomplete demand information and develop online learning algorithms whose average profit approaches that of the optimal $(s, S, p)$ policy with a tight $\tilde{O}(\sqrt{T})$ regret rate. A number of salient features differentiate our work from the existing online learning research in the operations management (OM) literature. First, computing the optimal $(s, S, p)$ policy requires solving a dynamic program (DP) over multiple periods involving unknown quantities, which is different from the majority of learning problems in OM that only require solving single-period optimization questions. It is hence challenging to establish stability results through DP recursions, which we accomplish by proving uniform convergence of the profit-to-go function. The necessity of analyzing action-dependent state transitions over multiple periods resembles the reinforcement learning question, which is considerably more difficult than existing bandit learning algorithms. Second, the pricing function $p$ is of infinite dimension, and approaching it is much more challenging than approaching a finite number of parameters as seen in existing research. The demand-price relationship is estimated based on an upper confidence bound, but the confidence interval cannot be explicitly calculated due to the complexity of the DP recursion. Finally, because of the multiperiod nature of $(s, S, p)$ policies, the actual distribution of the randomness in demand plays an important role in determining the optimal pricing strategy $p$, which is unknown to the learner a priori. In this paper, the demand randomness is approximated by an empirical distribution constructed using dependent samples, and a novel Wasserstein metric-based argument is employed to prove convergence of the empirical distribution.

This paper was accepted by J. George Shanthikumar, big data analytics.
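
As an illustration only (not the authors' algorithm), the policy structure described in the abstract, with a reorder point s, an order-up-to level S, and a price that depends on the inventory level, can be simulated under full information. The sketch below is a minimal example under stated assumptions: the linear demand model, the cost parameters, the choice to evaluate the price at the post-ordering inventory level, and the names `linear_demand` and `simulate_policy` are all hypothetical and do not come from the paper.

```python
import random


def linear_demand(price, rng, a=20.0, b=2.0, noise=2.0):
    """Hypothetical demand model: linear in price plus uniform noise, floored at 0."""
    return max(0.0, a - b * price + rng.uniform(-noise, noise))


def simulate_policy(s, S, price_fn, demand_fn, fixed_cost, unit_cost,
                    holding_cost, backlog_cost, periods, x0=0.0, seed=0):
    """Simulate a reorder-point / order-up-to / pricing policy with backlogging.

    Returns (average per-period profit, number of orders placed).
    """
    rng = random.Random(seed)
    x = x0              # inventory at the start of the period (negative = backlog)
    total_profit = 0.0
    orders = 0
    for _ in range(periods):
        # Ordering rule: raise inventory to S only when it has fallen below s.
        y = S if x < s else x
        order_cost = 0.0
        if y > x:
            order_cost = fixed_cost + unit_cost * (y - x)
            orders += 1
        # Pricing rule (assumption): price depends on inventory after ordering.
        p = price_fn(y)
        d = demand_fn(p, rng)
        x_next = y - d  # unsatisfied demand is carried over as backlog
        carry_cost = (holding_cost * max(x_next, 0.0)
                      + backlog_cost * max(-x_next, 0.0))
        total_profit += p * d - order_cost - carry_cost
        x = x_next
    return total_profit / periods, orders


if __name__ == "__main__":
    # Hypothetical two-level pricing rule: charge more when stock is low.
    avg_profit, n_orders = simulate_policy(
        s=5.0, S=30.0,
        price_fn=lambda y: 6.0 if y < 10.0 else 5.0,
        demand_fn=linear_demand,
        fixed_cost=10.0, unit_cost=1.0,
        holding_cost=0.5, backlog_cost=2.0,
        periods=1000,
    )
    print(avg_profit, n_orders)
```

The simulation assumes the demand model is known. The setting studied in the paper is harder because the demand-price relationship and the noise distribution must be learned while the policy is running.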

Publisher

Institute for Operations Research and the Management Sciences (INFORMS)

Subject

Management Science and Operations Research, Strategy and Management

References: 54 articles.

1. The Big Data Newsvendor: Practical Insights from Machine Learning

2. On Implications of Demand Censoring in the Newsvendor Problem

3. Dynamic Pricing Without Knowing the Demand Function: Risk Bounds and Near-Optimal Algorithms

Cited by 16 articles.

1. Learning-based dynamic pricing strategy with pay-per-chapter mode for online publisher with case study of COL. Decision Support Systems, 2024-11.

2. Simple heuristics for the joint inventory and pricing models with fixed replenishment costs. Journal of the Operational Research Society, 2024-07-19.

3. Partial Backorder Inventory System: Asymptotic Optimality and Demand Learning. SSRN Electronic Journal, 2024.

4. Multiproduct Inventory Systems with Upgrading: Replenishment, Allocation, and Online Learning. SSRN Electronic Journal, 2024.

5. An Online Mirror Descent Learning Algorithm for Multiproduct Inventory Systems. SSRN Electronic Journal, 2024.