1. Amari, S., Ba, J., Grosse, R.B., Li, X., Nitanda, A., Suzuki, T., Wu, D., Xu, J.: When does preconditioning help or hurt generalization? In: International Conference on Learning Representations (2021). https://openreview.net/forum?id=S724o4_WB3
2. Azulay, S., Moroshko, E., Nacson, M.S., Woodworth, B.E., Srebro, N., Globerson, A., Soudry, D.: On the implicit bias of initialization shape: beyond infinitesimal mirror descent. In: International Conference on Machine Learning, pp. 468–477. PMLR (2021)
3. Ba, J., Erdogdu, M.A., Suzuki, T., Wang, Z., Wu, D., Yang, G.: High-dimensional asymptotics of feature learning: how one gradient step improves the representation. In: Advances in Neural Information Processing Systems (2022). https://openreview.net/forum?id=akddwRG6EGi
4. Bartlett, P.L., Long, P.M., Lugosi, G., Tsigler, A.: Benign overfitting in linear regression. Proc. Natl. Acad. Sci. 117(48), 30063–30070 (2020)
5. Belkin, M., Hsu, D., Ma, S., Mandal, S.: Reconciling modern machine-learning practice and the classical bias-variance trade-off. Proc. Natl. Acad. Sci. 116(32), 15849–15854 (2019)