1. Optimization methods for large-scale machine learning;Bottou;SIAM Review,2018
2. Neither quick nor proper – evaluation of QuickProp for learning deep neural networks;Brust,2016
3. Cai, T., Luo, S., Xu, K., He, D., Liu, T.-Y., & Wang, L. (2021). GraphNorm: A principled approach to accelerating graph neural network training. In International Conference on Machine Learning (pp. 1204–1215).
4. Side channel attacks for architecture extraction of neural networks;Chabanne;CAAI Transactions on Intelligence Technology,2021
5. An image is worth 16 × 16 words: Transformers for image recognition at scale;Dosovitskiy,2020