1. Attention is all you need;A Vaswani;Advances in Neural Information Processing Systems, 2017
2. Sparsity in deep learning: Pruning and growth for efficient inference and training in neural networks;T Hoefler;Journal of Machine Learning Research, 2021
3. Movement pruning: Adaptive sparsity by fine-tuning;V Sanh;Advances in Neural Information Processing Systems, 2020