1. On exponential convergence of SGD in non-convex over-parametrized learning;Bassily,2018
2. The loss landscape of overparameterized neural networks;Cooper,2018
3. Gradient descent provably optimizes over-parameterized neural networks;Du,2018
4. Loss landscapes and optimization in over-parameterized non-linear systems and neural networks;Liu;Appl Comput Harmon Anal,2022
5. Over-parameterized deep neural networks have no strict local minima for any continuous activations;Li,2018