1. Hindsight task relabelling: Experience replay for sparse reward meta-RL;packer;Proc Adv Neural Inf Process Syst,0
2. Hindsight experience replay;andrychowicz;Proc 31st Int Conf Neural Inf Process Syst,0
3. Temporal difference models: Model-free deep RL for model-based control;pong;Proc Int Conf Learn Representations,0
4. World model as a graph: Learning latent landmarks for planning;zhang;Proc Int Conf Mach Learn,0
5. Robosuite: A modular simulation framework and benchmark for robot learning;zhu,2020