1. DeiT: Data-efficient image transformers;touvron;ArXiv Preprint,2020
2. Training data-efficient image transformers & distillation through attention;touvron;ArXiv Preprint,2020
3. EfficientNet: Rethinking model scaling for convolutional neural networks;tan;ArXiv Preprint,2019
4. Attention is all you need;vaswani;ArXiv Preprint,2017