1. Jain, P. and Kar, P. (2017) Non-Convex Optimization for Machine Learning. Foundations and Trends in Machine Learning, 10, 142-336.
2. Du, S., Lee, J., Li, H., et al. (2019) Gradient Descent Finds Global Minima of Deep Neural Networks. Proceedings of the 36th International Conference on Machine Learning, Long Beach, 28 May 2019, 1675-1685.
3. Mignacco, F. and Urbani, P. (2022) The Effective Noise of Stochastic Gradient Descent. Journal of Statistical Mechanics: Theory and Experiment, 2022, 083405.
4. Huang, F., Gao, S., Pei, J., et al. (2022) Accelerated Zeroth-Order and First-Order Momentum Methods from Mini to Minimax Optimization. Journal of Machine Learning Research, 23, 1616-1685.
5. Shani, L., Efroni, Y. and Mannor, S. (2020) Adaptive Trust Region Policy Optimization: Global Convergence and Faster Rates for Regularized MDPs. Proceedings of the AAAI Conference on Artificial Intelligence, 34, 5668-5675.