1. Optimality and approximation with policy gradient methods in Markov decision processes;Agarwal,2020
2. Deep reinforcement learning at the edge of the statistical precipice;Agarwal,2021
3. Towards a simple approach to multi-step model-based reinforcement learning;Asadi,2018
4. World discovery models;Azar,2019
5. Never give up: Learning directed exploration strategies;Badia,2020