1. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., and Polosukhin, I. (2017). Attention is all you need. arXiv, arXiv:1706.03762.
2. Howard, J., and Ruder, S. (2018). Universal language model fine-tuning for text classification. arXiv, arXiv:1801.06146.
3. Radford, A., Narasimhan, K., Salimans, T., and Sutskever, I. (2018). Improving language understanding by generative pre-training. Available online: https://s3-us-west-2.amazonaws.com/openai-assets/research-covers/language-unsupervised/language_understanding_paper.pdf (accessed on 18 July 2023).
4. Devlin, J., Chang, M.W., Lee, K., and Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv, arXiv:1810.04805.
5. Geramifard, A. (2022). Project CAIRaoke: Building the assistants of the future with breakthroughs in conversational AI. Available online: https://ai.facebook.com/blog/project-cairaoke/ (accessed on 18 July 2023).