[1] Vaswani, Ashish, Noam Shazeer, Niki Parmar, et al. "Attention is all you need." In Proceedings of the 31st International Conference on Neural Information Processing Systems (NeurIPS 2017): 6000–6010.
[2] Touvron, Hugo, Louis Martin, Kevin Stone, Peter Albert, Amjad Almahairi, Yasmine Babaei, Nikolay Bashlykov, et al. "Llama 2: Open foundation and fine-tuned chat models." arXiv preprint arXiv:2307.09288 (2023).
[3] Xu, Lingling, Haoran Xie, Si-Zhao Joe Qin, et al. "Parameter-efficient fine-tuning methods for pretrained language models: A critical review and assessment." arXiv preprint arXiv:2312.12148 (2023).
[4] Hu, Edward J., Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen. "LoRA: Low-rank adaptation of large language models." arXiv preprint arXiv:2106.09685 (2021).
[5] Dettmers, Tim, Artidoro Pagnoni, Ari Holtzman, and Luke Zettlemoyer. "QLoRA: Efficient finetuning of quantized LLMs." Advances in Neural Information Processing Systems 36 (2023).