1. Towards understanding sharpness-aware minimization;Andriushchenko,2022
2. Nonlinear acceleration of momentum and primal-dual algorithms;Bollapragada;Mathematical Programming,2022
3. Accelerated linear convergence of stochastic momentum methods in Wasserstein distances;Can,2019
4. Chen, X., Hsieh, C.-J., & Gong, B. (2022). When Vision Transformers Outperform ResNets without Pre-training or Strong Data Augmentations. In International Conference on Learning Representations.
5. Chen, X., Liu, S., Sun, R., & Hong, M. (2019). On the Convergence of A Class of Adam-Type Algorithms for Non-Convex Optimization. In International Conference on Learning Representations.