1. Learning from suboptimal demonstration via self-supervised reward regression;chen;Conference on Robot Learning,0
2. Model-Free Preference-Based Reinforcement Learning
3. Exploration by random network distillation;burda;the Seventh International Conference on Learning Representations,2019
4. On Learning From Game Annotations
5. Deep reinforcement learning in a handful of trials using probabilistic dynamics models;chua;Advances in neural information processing systems,2018