1. Brown, T., et al. (2020). Language models are few-shot learners. Adv. Neural Inf. Process. Syst.
2. Hoglund, S., and Khedri, J. (2024, May 01). Comparison Between RLHF and RLAIF in Fine-Tuning a Large Language Model. Available online: https://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-331926.
3. Wei, J., et al. (2022). Chain-of-thought prompting elicits reasoning in large language models. Adv. Neural Inf. Process. Syst.
4. Creswell, A., Shanahan, M., and Higgins, I. (2022). Selection-inference: Exploiting large language models for interpretable logical reasoning. arXiv.
5. Meta Fundamental AI Research Diplomacy Team (FAIR), Bakhtin, A., Brown, N., Dinan, E., Farina, G., Flaherty, C., Fried, D., Goff, A., Gray, J., and Hu, H. (2022). Human-level play in the game of Diplomacy by combining language models with strategic reasoning. Science, 378, 1067–1074.