1. Armenta, M.A., and Jodoin, P.M. (2020). The Representation Theory of Neural Networks. arXiv.
2. Neyshabur, B., Salakhutdinov, R., and Srebro, N. (2015). Path-SGD: Path-Normalized Optimization in Deep Neural Networks. Advances in Neural Information Processing Systems 28 (NIPS 2015), MIT Press.
3. Meng, Q., Zheng, S., Zhang, H., Chen, W., Ye, Q., Ma, Z., Yu, N., and Liu, T. (2018). G-SGD: Optimizing ReLU Neural Networks in its Positively Scale-Invariant Space. arXiv.
4. Badrinarayanan, V., Mishra, B., and Cipolla, R. (2015). Understanding Symmetries in Deep Networks. arXiv.
5. Dinh, L., Pascanu, R., Bengio, S., and Bengio, Y. (2017, August 6–11). Sharp Minima Can Generalize for Deep Nets. Proceedings of the 34th International Conference on Machine Learning, Sydney, Australia.