1. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., and Polosukhin, I. (2017). Attention is all you need. Adv. Neural Inf. Process. Syst., 30.
2. Radford, A., Narasimhan, K., Salimans, T., and Sutskever, I. (2018). Improving Language Understanding by Generative Pre-Training. Available online: https://api.semanticscholar.org/CorpusID:49313245 (accessed on 3 April 2024).
3. Devlin, J., Chang, M.-W., Lee, K., and Toutanova, K. (2019, June 2–7). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Proceedings of the NAACL-HLT, Minneapolis, MN, USA.
4. Patil, R., and Gudivada, V. (2024). A Review of Current Trends, Techniques, and Challenges in Large Language Models (LLMs). Appl. Sci., 14.
5. Cheng, J. (2024). Applications of Large Language Models in Pathology. Bioengineering, 11.