1. An Image Is Worth 16x16 Words: Transformers for Image Recognition at Scale;Dosovitskiy,2020
2. Findings of the 2020 conference on machine translation (WMT20);Barrault,2020
3. Deep learning of representations: Looking forward;Bengio,2013
4. The Power of Scale for Parameter-Efficient Prompt Tuning;Lester,2021
5. Language models are few-shot learners;Brown,2020