1. Bertsekas DP (2019) Reinforcement learning and optimal control. Athena Scientific, Belmont
2. Gu S, Holly E, Lillicrap T, Levine S (2017) Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates. In: International conference on robotics and automation. IEEE, pp 3389–3396
3. Hochreiter S, Schmidhuber J (1997) Long short-term memory. Neural Comput 9(8):1735–1780
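4. Mnih V, Kavukcuoglu K, Silver D, Graves A, Antonoglou I, Wierstra D, Riedmiller M (2013) Playing Atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602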
5. Peskir G, Shiryaev A (2006) Optimal stopping and free-boundary problems. Lectures in Mathematics ETH Zürich. Birkhäuser, Basel