1. Attention is all you need;Vaswani;NeurIPS,2017
2. Training data-efficient image transformers & distillation through attention;Touvron,2020
3. The dynamic representation of scenes
4. Stand-alone self-attention in vision models;Ramachandran;NeurIPS,2019
5. Designing network design spaces