1. Agarwal, A., et al. (2021), “On the Theory of Policy Gradient Methods: Optimality, Approximation, and Distribution Shift,” The Journal of Machine Learning Research.
2. Agarwal, R., Schuurmans, D., and Norouzi, M. (2020), “An Optimistic Perspective on Offline Reinforcement Learning,” in International Conference on Machine Learning, pp. 104–114, PMLR.
3. Agu, E., Pedersen, P., Strong, D., Tulu, B., He, Q., Wang, L., and Li, Y. (2013), “The Smartphone as a Medical Device: Assessing Enablers, Benefits and Challenges,” in 2013 IEEE International Workshop of Internet-of-Things Networking and Control (IoT-NC), pp. 48–52, IEEE. DOI: 10.1109/IoT-NC.2013.6694053.
4. Antos, A., Szepesvári, C., and Munos, R. (2008), “Fitted Q-Iteration in Continuous Action-Space MDPs,” in Advances in Neural Information Processing Systems (Vol. 20), eds. J. Platt, D. Koller, Y. Singer, and S. Roweis, Curran Associates, Inc.
5. Battey, H., et al. (2018), “Distributed Testing and Estimation Under Sparse High Dimensional Models,” Annals of Statistics.