1. D. Amodei et al. Concrete problems in AI safety. arXiv:1606.06565, 2016. https://arxiv.org/abs/1606.06565
2. B. Baker, O. Gupta, N. Naik, and R. Raskar. Designing neural network architectures using reinforcement learning. arXiv:1611.02167, 2016. https://arxiv.org/abs/1611.02167
3. J. Baxter, A. Tridgell, and L. Weaver. KnightCap: A chess program that learns by combining TD(λ) with game-tree search. arXiv:cs/9901002, 1999. https://arxiv.org/abs/cs/9901002
4. M. Bellemare, Y. Naddaf, J. Veness, and M. Bowling. The arcade learning environment: An evaluation platform for general agents. Journal of Artificial Intelligence Research, 47, pp. 253–279, 2013.
5. R. E. Bellman. Dynamic Programming. Princeton University Press, 1957.