1. PR-DARTS: Pruning-based differentiable architecture search;mousavi;ArXiv Preprint,2022
2. Surgical fine-tuning improves adaptation to distribution shifts;lee;ArXiv Preprint,2022
3. signSGD: Compressed optimisation for non-convex problems;bernstein;International Conference on Machine Learning,2018
4. QSGD: Communication-efficient SGD via gradient quantization and encoding;alistarh;Advances in Neural Information Processing Systems,2017
5. On-device training under 256KB memory;lin;ArXiv Preprint,2022