1. Altman, E.: Constrained Markov decision processes, vol. 7. CRC Press (1999)
2. Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent Dirichlet allocation. J. Mach. Learn. Res. 3, 993–1022 (2003)
3. Chaslot, G., Bakkes, S., Szita, I., Spronck, P.: Monte-Carlo tree search: a new framework for game AI. In: AIIDE (2008)
4. Frans, K., Ho, J., Chen, X., Abbeel, P., Schulman, J.: Meta learning shared hierarchies. arXiv preprint arXiv:1710.09767 (2017)
5. Gruslys, A., Dabney, W., Azar, M.G., Piot, B., Bellemare, M., Munos, R.: The Reactor: a fast and sample-efficient actor-critic agent for reinforcement learning. arXiv preprint arXiv:1704.04651 (2017)