1. Bansal, T., Pachocki, J., Sidor, S., Sutskever, I., Mordatch, I.: Emergent complexity via multi-agent competition (2017). CoRR arXiv:1710.03748
2. Baykal-Gürsoy, M., Gürsoy, K.: Semi-Markov decision processes: nonstandard criteria. Probab. Eng. Inf. Sci. 21(4), 635–657 (2007)
3. Berner, C., Brockman, G., Chan, B., Cheung, V., Debiak, P., Dennison, C., Farhi, D., Fischer, Q., Hashme, S., Hesse, C., Józefowicz, R., Gray, S., Olsson, C., Pachocki, J., Petrov, M., de Oliveira Pinto, H.P., Raiman, J., Salimans, T., Schlatter, J., Schneider, J., Sidor, S., Sutskever, I., Tang, J., Wolski, F., Zhang, S.: Dota 2 with large scale deep reinforcement learning (2019). CoRR arXiv:1912.06680
4. Korf, R.E.: Learning to solve problems by searching for macro-operators. Ph.D. thesis, USA (1983). AAI8425820
5. McGovern, A., Sutton, R.S., Fagg, A.H.: Roles of macro-actions in accelerating reinforcement learning. In: Grace Hopper Celebration of Women in Computing, pp. 13–18 (1997)