The Efficacy of Pessimism in Asynchronous Q-Learning

Author:

Yan Yuling¹, Li Gen², Chen Yuxin², Fan Jianqing³

Affiliation:

1. Institute for Data, Systems, and Society, Massachusetts Institute of Technology, Cambridge, MA, USA

2. Department of Statistics and Data Science, The Wharton School, University of Pennsylvania, Philadelphia, PA, USA

3. Department of Operations Research and Financial Engineering, Princeton University, Princeton, NJ, USA

Funder

Charlotte Elizabeth Procter Honorific Fellowship from Princeton University

Alfred P. Sloan Research Fellowship

Google Research Scholar Award

Air Force Office of Scientific Research

Office of Naval Research

NSF

Publisher

Institute of Electrical and Electronics Engineers (IEEE)

Subject

Library and Information Sciences, Computer Science Applications, Information Systems

References (65 articles)

1. Li. Breaking the sample complexity barrier to regret-optimal model-free reinforcement learning. Inf. Inference J. IMA, 2022.

2. Zhang. Settling the sample complexity of online reinforcement learning. arXiv:2307.13586, 2023.

3. Zhang. Almost optimal model-free reinforcement learning via reference-advantage decomposition. Proc. Adv. Neural Inf. Process. Syst., 2020.

4. Li. Breaking the sample size barrier in model-based reinforcement learning with a generative model. Proc. Adv. Neural Inf. Process. Syst., 2020.