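1. Baird L. (1995). Residual algorithms: Reinforcement learning with function approximation. In Proceedings of the Twelfth International Conference on Machine Learning, pp. 30-37.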
2. Beattie C., Leibo J. Z., Teplyashin D., Ward T., Wainwright M., Küttler H., Lefrancq A., Green S., Valdés V., Sadik A., Schrittwieser J., Anderson K., York S., Cant M., Cain A., Bolton A., Gaffney S., King H., Hassabis D., Petersen S. (2016). DeepMind Lab. arXiv preprint arXiv:1612.03801.
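3. Bellemare M. G., Naddaf Y., Veness J., Bowling M. (2013). The Arcade Learning Environment: An evaluation platform for general agents. Journal of Artificial Intelligence Research, 47, 253-279.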
4. Brockman G., Cheung V., Pettersson L., Schneider J., Schulman J., Tang J., Zaremba W. (2016). OpenAI Gym. arXiv preprint arXiv:1606.01540.
5. Chen L., Lu K., Rajeswaran A., Lee K., Grover A., Laskin M., Abbeel P., Srinivas A., Mordatch I. (2021). Decision transformer: Reinforcement learning via sequence modeling. arXiv preprint arXiv:2106.01345.