1. An, J., Lu, J., Ying, L.: Stochastic modified equations for the asynchronous stochastic gradient descent. Inf. Inference J. IMA 9(4), 851–873 (2020)
2. An, J., Ying, L., Zhu, Y.: Why resampling outperforms reweighting for correcting sampling bias with stochastic gradients (2020). arXiv:2009.13447
3. Baird, L.: Residual algorithms: reinforcement learning with function approximation. In: Machine Learning Proceedings 1995, pp. 30–37. Elsevier (1995)
4. Bradtke, S.: Reinforcement learning applied to linear quadratic regulation. In: Advances in Neural Information Processing Systems, p. 5 (1992)
5. Brockman, G., Cheung, V., Pettersson, L., Schneider, J., Schulman, J., Tang, J., Zaremba, W.: OpenAI Gym (2016). arXiv:1606.01540