Online Learning via Offline Greedy Algorithms: Applications in Market Design and Optimization

Published:2022-10-27
ISSN:0025-1909
Container-title:Management Science
language:en
Short-container-title:Management Science

Author:

Niazadeh Rad¹, Golrezaei Negin², Wang Joshua³, Susan Fransisca², Badanidiyuru Ashwinkumar³

Affiliation:

1. Operations Management, Chicago Booth School of Business, Chicago, Illinois 60637;

2. Operations Management, MIT Sloan School of Management, Cambridge, Massachusetts 02142;

3. Google Research Mountain View, Mountain View, California 94043

Abstract

Motivated by online decision making in time-varying combinatorial environments, we study the problem of transforming offline algorithms into their online counterparts. We focus on offline combinatorial problems that are amenable to a constant-factor approximation using a greedy algorithm that is robust to local errors. For such problems, we provide a general framework that efficiently transforms offline robust greedy algorithms into online ones using Blackwell approachability. We show that the resulting online algorithms have [Formula: see text] (approximate) regret under the full information setting. We further introduce a bandit extension of Blackwell approachability that we call Bandit Blackwell approachability. We leverage this notion to transform robust greedy offline algorithms into online algorithms with [Formula: see text] (approximate) regret in the bandit setting. Demonstrating the flexibility of our framework, we apply our offline-to-online transformation to several problems at the intersection of revenue management, market design, and online optimization, including product ranking optimization in online platforms, reserve price optimization in auctions, and submodular maximization. We also extend our reduction to greedy-like first-order methods used in continuous optimization, such as those used for maximizing continuous strong DR monotone submodular functions subject to convex constraints. We show that our transformation, when applied to these applications, leads to new regret bounds or improves the best known bounds. We complement our theoretical studies with numerical simulations for two of our applications; in both, the empirical performance of our transformations exceeds the theoretical guarantees on practical instances. This paper was accepted by George Shanthikumar, data science.
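
As a rough illustration of the kind of offline greedy algorithm the abstract refers to, the sketch below runs the classic greedy rule for monotone submodular maximization under a cardinality constraint. It is not the paper's offline-to-online transformation and involves no Blackwell approachability. The function names `greedy_max` and `coverage` and the toy instance `SETS` are hypothetical and chosen only for this example.

```python
from typing import Callable, FrozenSet, Iterable, Set


def greedy_max(
    ground: Iterable[int],
    f: Callable[[FrozenSet[int]], float],
    k: int,
) -> Set[int]:
    """Pick up to k elements, each round adding the largest marginal gain."""
    chosen: Set[int] = set()
    remaining = set(ground)
    for _ in range(k):
        base = f(frozenset(chosen))
        best, best_gain = None, 0.0
        # Iterate in sorted order so ties are broken deterministically.
        for e in sorted(remaining):
            gain = f(frozenset(chosen | {e})) - base
            if gain > best_gain:
                best, best_gain = e, gain
        if best is None:  # no element improves the objective
            break
        chosen.add(best)
        remaining.remove(best)
    return chosen


# Hypothetical coverage instance: each index covers a set of items.
SETS = {0: {1, 2, 3}, 1: {3, 4}, 2: {4, 5, 6, 7}, 3: {1, 7}}


def coverage(selection: FrozenSet[int]) -> float:
    """Number of distinct items covered (monotone and submodular)."""
    covered: Set[int] = set()
    for i in selection:
        covered |= SETS[i]
    return float(len(covered))


picked = greedy_max(SETS.keys(), coverage, k=2)
print(sorted(picked), coverage(frozenset(picked)))
```

Each round's choice depends only on marginal gains, which is the local structure the abstract's robustness condition is about: a small error in one round's selection affects the final value in a controlled way.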

Publisher

Institute for Operations Research and the Management Sciences (INFORMS)

Subject

Management Science and Operations Research, Strategy and Management

Link

http://pubsonline.informs.org/doi/pdf/10.1287/mnsc.2022.4558

Reference: 33 articles.

1. MNL-Bandit: A Dynamic Learning Approach to Assortment Selection

2. Maximizing a class of submodular utility functions

3. Optimal auctions vs. anonymous pricing

4. Display Optimization for Vertically Differentiated Locations Under Multinomial Logit Preferences

5. Position Auctions with Consumer Search

Cited by 3 articles.

1. Wasserstein gradient flow for optimal probability measure decomposition;SSRN Electronic Journal;2024

2. Solving Optimization Problems with Blackwell Approachability;Mathematics of Operations Research;2023-05-16

3. Contextual Bandits with Cross-Learning;Mathematics of Operations Research;2022-09-30