1. Bojanowski et al. Enriching word vectors with subword information. Transactions of the Association for Computational Linguistics, 2017.
2. Brown et al. Language models are few-shot learners. 2020.
3. Dai et al. Transformer-XL: Attentive language models beyond a fixed-length context. 2019.
4. Deng et al. RLPrompt: Optimizing discrete text prompts with reinforcement learning. 2022.
5. Devlin et al. BERT: Pre-training of deep bidirectional transformers for language understanding. 2018.