1. Backward feature correction: How deep learning performs deep (hierarchical) learning;Allen-Zhu,2023
2. Riemannian adaptive optimization methods;Becigneul;International Conference on Learning Representations,2019
3. Learning deep architectures for AI;Bengio;Foundations and trends in Machine Learning,2009
4. Representation learning: A review and new perspectives;Bengio;IEEE Transactions on Pattern Analysis and Machine Intelligence,2013
5. A representer theorem for deep kernel learning;Bohn;Journal of Machine Learning Research,2019