1. Jacot, A., Gabriel, F. & Hongler, C. Neural tangent kernel: convergence and generalization in neural networks. Adv. Neural Inf. Process. Syst. 31, 8571–8580 (2018).
2. Matthews, A. G. de G., Hron, J., Rowland, M., Turner, R. E. & Ghahramani, Z. Gaussian process behaviour in wide deep neural networks. In International Conference on Learning Representations (2018).
3. Naveh, G., Ben David, O., Sompolinsky, H. & Ringel, Z. Predicting the outputs of finite deep neural networks trained with noisy gradients. Phys. Rev. E 104, 064301 (2021).
4. Li, Y., Yosinski, J., Clune, J., Lipson, H. & Hopcroft, J. Convergent learning: do different neural networks learn the same representations? In Proc. 1st International Workshop on Feature Extraction: Modern Questions and Challenges at NIPS 2015, vol. 44 of Proc. Machine Learning Research (eds Storcheus, D., Rostamizadeh, A. & Kumar, S.) 196–212 (PMLR, Montreal, Canada, 2015).
5. Chizat, L., Oyallon, E. & Bach, F. On lazy training in differentiable programming. Adv. Neural Inf. Process. Syst. 32, 2937–2947 (2019).