Bandits and Experts in Metric Spaces

Published:2019-08-26 Issue:4 Volume:66 Page:1-77
ISSN:0004-5411
Container-title:Journal of the ACM
language:en
Short-container-title:J. ACM

Author:

Kleinberg Robert¹, Slivkins Aleksandrs², Upfal Eli³

Affiliation:

1. Computer Science Department, Cornell University, Ithaca, NY, USA

2. Microsoft Research NYC, New York, NY, USA

3. Computer Science Department, Brown University, Providence, RI, USA

Abstract

In a multi-armed bandit problem, an online algorithm chooses from a set of strategies in a sequence of trials to maximize the total payoff of the chosen strategies. While the performance of bandit algorithms with a small finite strategy set is well understood, bandit problems with large strategy sets are still a topic of active investigation, motivated by practical applications, such as online auctions and web advertisement. The goal of such research is to identify broad and natural classes of strategy sets and payoff functions that enable the design of efficient solutions. In this work, we study a general setting for the multi-armed bandit problem, in which the strategies form a metric space, and the payoff function satisfies a Lipschitz condition with respect to the metric. We refer to this problem as the Lipschitz MAB problem. We present a solution for the multi-armed bandit problem in this setting. That is, for every metric space, we define an isometry invariant that bounds from below the performance of Lipschitz MAB algorithms for this metric space, and we present an algorithm that comes arbitrarily close to meeting this bound. Furthermore, our technique gives even better results for benign payoff functions. We also address the full-feedback (“best expert”) version of the problem, where after every round the payoffs from all arms are revealed.
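
For readers unfamiliar with the setting, the Lipschitz condition mentioned in the abstract can be written out as follows. This is a minimal sketch in standard bandit notation; the symbols $X$, $D$, $\mu$, $x_t$, $T$, and $R(T)$ are assumptions introduced here for illustration and do not appear in the abstract itself.

```latex
% Assumed notation (not from the abstract):
%   (X, D)  : the metric space of strategies (arms)
%   \mu     : the expected-payoff function, \mu : X \to [0,1]
%   x_t     : the arm chosen by the algorithm in round t
%   T       : the number of rounds

% Lipschitz condition on the payoff function with respect to the metric:
\[
  |\mu(x) - \mu(y)| \;\le\; D(x, y) \qquad \text{for all } x, y \in X .
\]

% A common performance measure is regret: the gap between always playing
% a best arm and the expected payoff actually collected over T rounds:
\[
  R(T) \;=\; T \cdot \sup_{x \in X} \mu(x) \;-\; \mathbb{E}\!\left[ \sum_{t=1}^{T} \mu(x_t) \right].
\]
```

Under this reading, nearby arms have nearby expected payoffs, which is what lets an algorithm generalize from sampled arms to an infinite strategy set.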

Funder

NSF

ONR

Publisher

Association for Computing Machinery (ACM)

Subject

Artificial Intelligence, Hardware and Architecture, Information Systems, Control and Systems Engineering, Software

Link

https://dl.acm.org/doi/pdf/10.1145/3299873

References: 101 articles.

1. Name independent routing for growth bounded networks

2. The Continuum-Armed Bandit Problem

3. A Near-Optimal Exploration-Exploitation Approach for Assortment Selection

Cited by 21 articles.

1. Sequential query prediction based on multi-armed bandits with ensemble of transformer experts and immediate feedback;Data Mining and Knowledge Discovery;2024-08-02

2. The Role of Transparency in Repeated First-Price Auctions with Unknown Valuations;Proceedings of the 56th Annual ACM Symposium on Theory of Computing;2024-06-10

3. Estimation and inference for minimizer and minimum of convex functions: Optimality, adaptivity and uncertainty principles;The Annals of Statistics;2024-02-01

4. The Power of Age-based Reward in Fresh Information Acquisition;IEEE INFOCOM 2023 - IEEE Conference on Computer Communications;2023-05-17

5. An Online Inference-Aided Incentive Framework for Information Elicitation Without Verification;IEEE Journal on Selected Areas in Communications;2023-04