1. Achiam J, Knight E, Abbeel P (2019) Towards characterizing divergence in deep Q-learning. arXiv:1903.08894
2. Barto A, Sutton R, Anderson C (1983) Neuronlike adaptive elements that can solve difficult learning control problems. IEEE Trans Syst Man Cybern 5:833–836
3. Boyan J, Moore A (1995) Generalization in reinforcement learning: safely approximating the value function. NIPS-7. Morgan Kaufmann, San Mateo, CA
4. Brady T, Paschall S (2010) The challenge of safe lunar landing. IEEE Aerospace Conference. IEEE
5. Brockman G, Cheung V, Pettersson L, Schneider J, Schulman J, Tang J, Zaremba W (2016) OpenAI Gym