1. Kravaris T, Spatharis C, Bastas A, Vouros GA, Blekas K, Andrienko G, Andrienko N, Garcia JM. Resolving congestions in the air traffic management domain via multiagent reinforcement learning methods. 14 December 2019. arXiv preprint arXiv:1912.06860
2. Sutton RS, Barto AG. Reinforcement learning: An introduction. Cambridge, Massachusetts, USA: MIT Press; 13 November 2018
3. Mnih V, Kavukcuoglu K, Silver D, Rusu AA, Veness J, Bellemare MG, et al. Human-level control through deep reinforcement learning. Nature. 2015;518(7540):529-533
4. Tsitsiklis J, Van Roy B. Analysis of temporal-difference learning with function approximation. Advances in Neural Information Processing Systems. 1996;9:1075-1081
5. Wang Z, Schaul T, Hessel M, van Hasselt H, Lanctot M, de Freitas N. Dueling network architectures for deep reinforcement learning. In: International Conference on Machine Learning. New York, NY, USA: JMLR: W&CP; 11 June 2016. pp. 1995-2003