1. Distributional reinforcement learning for multi-dimensional reward functions;Zhang,2021
2. RD$^{2}$: Reward decomposition with representation decomposition;Lin,2020
3. Hybrid reward architecture for reinforcement learning;van Seijen,2017
4. HORDE: A scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction;Sutton,2011
5. Universal value function approximators;Schaul,2015