1. Asis, K.D., Hernandez-Garcia, J.F., Holland, G.Z., Sutton, R.S.: Multi-step reinforcement learning: A unifying algorithm. CoRR abs/1703.01327 (2017)
2. Bellemare, M.G., Dabney, W., Munos, R.: A distributional perspective on reinforcement learning. CoRR abs/1707.06887 (2017)
3. Bellemare, M.G., Naddaf, Y., Veness, J., Bowling, M.: The arcade learning environment: an evaluation platform for general agents. In: Proceedings of the 24th International Conference on Artificial Intelligence, pp. 4148–4152. AAAI Press (2015)
4. Bellman, R., Corporation, R., Collection, K.M.R.: Dynamic Programming. Rand Corporation research study, Princeton University Press (1957)
5. Bengio, Y.: Learning deep architectures for ai. Found. Trends Mach. Learn. 2(1), 1–127 (2009). https://doi.org/10.1561/2200000006