Deep Reinforcement Learning That Matters

Published:2018-04-29 Issue:1 Volume:32 Page:
ISSN:2374-3468
Container-title:Proceedings of the AAAI Conference on Artificial Intelligence
language:
Short-container-title:AAAI

Author:

Peter Henderson, Riashat Islam, Philip Bachman, Joelle Pineau, Doina Precup, David Meger

Abstract

In recent years, significant progress has been made in solving challenging problems across various domains using deep reinforcement learning (RL). Reproducing existing work and accurately judging the improvements offered by novel methods is vital to sustaining this progress. Unfortunately, reproducing results for state-of-the-art deep RL methods is seldom straightforward. In particular, non-determinism in standard benchmark environments, combined with variance intrinsic to the methods, can make reported results tough to interpret. Without significance metrics and tighter standardization of experimental reporting, it is difficult to determine whether improvements over the prior state-of-the-art are meaningful. In this paper, we investigate challenges posed by reproducibility, proper experimental techniques, and reporting procedures. We illustrate the variability in reported metrics and results when comparing against common baselines and suggest guidelines to make future results in deep RL more reproducible. We aim to spur discussion about how to ensure continued progress in the field by minimizing wasted effort stemming from results that are non-reproducible and easily misinterpreted.

Publisher

Association for the Advancement of Artificial Intelligence (AAAI)

Subject

General Medicine

Cited by 330 articles.

1. Workflow scheduling based on asynchronous advantage actor–critic algorithm in multi-cloud environment;Expert Systems with Applications;2024-12

2. Simulation-based evaluation of model-free reinforcement learning algorithms for quadcopter attitude control and trajectory tracking;Neurocomputing;2024-12

3. Age of information minimization in UAV-assisted data harvesting networks by multi-agent deep reinforcement curriculum learning;Expert Systems with Applications;2024-12

4. Predicting adolescent psychopathology from early life factors: A machine learning tutorial;Global Epidemiology;2024-12

5. Dynamic preference inference network: Improving sample efficiency for multi-objective reinforcement learning by preference estimation;Knowledge-Based Systems;2024-11