1. Landers M, Doryab A (2023) Deep reinforcement learning verification: a survey. ACM Comput Surv 55(14s):1–14. https://doi.org/10.1145/3596444
2. Salimans T, Ho J, Chen X, Sidor S, Sutskever I (2017) Evolution strategies as a scalable alternative to reinforcement learning. arXiv:1703.03864
3. Such FP, Madhavan V, Conti E, Lehman J, Stanley KO, Clune J (2017) Deep neuroevolution—genetic algorithms are a competitive alternative for training deep neural networks for reinforcement learning. arXiv
4. Schulman J, Levine S, Abbeel P, Jordan MI, Moritz P (2015) Trust region policy optimization. arXiv
5. Schulman J, Wolski F, Dhariwal P, Radford A, Klimov O (2017) Proximal policy optimization algorithms. arXiv