1. Mohamed S. Abdelfattah, Abhinav Mehrotra, Łukasz Dudziak, and Nicholas D. Lane. 2021. Zero-Cost Proxies for Lightweight NAS. In International Conference on Learning Representations (ICLR).
2. Shun-Ichi Amari. 1998. Natural gradient works efficiently in learning. Neural computation, Vol. 10, 2 (1998), 251--276.
3. Kartikeya Bhardwaj, James Ward, Caleb Tung, Dibakar Gope, Lingchuan Meng, Igor Fedorov, Alex Chalfin, Paul Whatmough, and Danny Loh. 2022. Restructurable activation networks. arXiv preprint arXiv:2208.08562 (2022).
4. Léon Bottou and Yann Le Cun. 2005. On-line learning for very large data sets. Applied stochastic models in business and industry, Vol. 21, 2 (2005), 137--151.
5. Andrew Brock, Theo Lim, J.M. Ritchie, and Nick Weston. 2018. SMASH: One-Shot Model Architecture Search through HyperNetworks. In International Conference on Learning Representations (ICLR).