1. Fine-tuning can distort pretrained features and underperform out-of-distribution;Kumar;ArXiv Preprint,2022
2. Training skinny deep neural networks with iterative hard thresholding methods;Jin;ArXiv Preprint,2016
3. Top-KAST: Top-K always sparse training;Jayakumar;Conference on Neural Information Processing Systems (NeurIPS),2020
4. MobileNets: Efficient convolutional neural networks for mobile vision applications;A. G. Howard;ArXiv Preprint,2017