1. David M. Blei, Andrew Y. Ng, and Michael I. Jordan. 2003. Latent Dirichlet Allocation. J. Mach. Learn. Res.
2. Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D. Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. 2020. Language Models are Few-Shot Learners. In NeurIPS.
3. Kevin Clark, Minh-Thang Luong, Quoc V. Le, and Christopher D. Manning. 2020. ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators. In ICLR.
4. Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In NAACL-HLT.
5. Tianyu Gao, Adam Fisch, and Danqi Chen. 2021. Making Pre-trained Language Models Better Few-shot Learners. In ACL.