1. TensorFlow: a system for large-scale machine learning;Abadi,2016
2. On the convergence rate of training recurrent neural networks;Allen-Zhu,2018
3. A convergence theory for deep learning via over-parameterization;Allen-Zhu,2019
4. Dynamical isometry and a mean field theory of RNNs: gating enables signal propagation in recurrent neural networks;Chen,2018
5. On the global convergence of gradient descent for over-parameterized models using optimal transport;Chizat,2018