Affiliation:
1. Graduate School of Business, Stanford University, Stanford, California 94305;
2. Electrical Engineering Department, Stanford University, Stanford, California 94305
Abstract
In nonparametric contextual bandit formulations, a key complexity driver is the smoothness of payoff functions with respect to covariates. In many practical settings, the smoothness of payoffs is unknown, and misspecification of smoothness may severely deteriorate the performance of existing methods. In the paper “Smoothness-Adaptive Contextual Bandits,” Yonatan Gur, Ahmadreza Momeni, and Stefan Wager consider a framework where the smoothness of payoff functions is unknown and study when and how algorithms may adapt to unknown smoothness. First, they establish that designing algorithms that adapt to unknown smoothness is, in general, impossible. However, under a natural self-similarity condition, they establish that adapting to unknown smoothness is possible and devise a general policy for achieving smoothness-adaptive performance. The policy infers the smoothness of payoffs throughout the decision-making process while leveraging the structure of off-the-shelf nonadaptive policies. It matches (up to a logarithmic scale) the performance that is achievable when the smoothness of payoffs is known in advance.
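To make the role of the smoothness parameter concrete, the sketch below shows a standard (nonadaptive) construction for nonparametric contextual bandits: the covariate space is partitioned into bins whose width is tuned to an *assumed* Hölder smoothness, and an off-the-shelf UCB rule runs independently inside each bin. This is a generic textbook-style illustration of why misspecifying the smoothness matters, not the adaptive policy proposed in the paper; all function and variable names are invented for this example.

```python
import math
import random

def binned_ucb_bandit(T, beta, payoff_fns, rng):
    """Illustrative binned-UCB policy for a 1-D nonparametric contextual
    bandit. The covariate space [0, 1] is split into bins whose count is
    tuned to an assumed Holder smoothness `beta`; a standard UCB rule is
    run independently inside each bin. If `beta` is misspecified, the
    bin width is wrong and regret degrades, which is the failure mode
    the paper's adaptive policy is designed to avoid."""
    n_arms = len(payoff_fns)
    # Classical tuning: number of bins grows like T^(1 / (2*beta + 1)).
    n_bins = max(1, math.ceil(T ** (1.0 / (2.0 * beta + 1.0))))
    counts = [[0] * n_arms for _ in range(n_bins)]
    sums = [[0.0] * n_arms for _ in range(n_bins)]
    regret = 0.0
    for t in range(1, T + 1):
        x = rng.random()                      # covariate ~ Uniform[0, 1]
        b = min(int(x * n_bins), n_bins - 1)  # bin containing x
        if 0 in counts[b]:
            # Play each arm once in this bin before applying UCB.
            a = counts[b].index(0)
        else:
            a = max(range(n_arms),
                    key=lambda k: sums[b][k] / counts[b][k]
                    + math.sqrt(2.0 * math.log(t) / counts[b][k]))
        means = [f(x) for f in payoff_fns]
        reward = means[a] + rng.gauss(0.0, 0.1)  # noisy observed payoff
        counts[b][a] += 1
        sums[b][a] += reward
        regret += max(means) - means[a]  # instantaneous regret at x
    return regret, n_bins

# Example run with two smooth payoff functions (beta = 1 assumed).
rng = random.Random(0)
fns = [lambda x: 0.5 + 0.3 * math.sin(3.0 * x),
       lambda x: 0.5 + 0.3 * x]
total_regret, n_bins = binned_ucb_bandit(2000, 1.0, fns, rng)
```

The key point is that the bin count depends explicitly on `beta`: the policy cannot be run without committing to a smoothness level up front, which is precisely the knowledge the paper's self-similarity-based approach dispenses with.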
Publisher
Institute for Operations Research and the Management Sciences (INFORMS)
Subject
Management Science and Operations Research, Computer Science Applications
Cited by
1 article.
1. Optimal subgroup selection;The Annals of Statistics;2023-12-01