1. Guyon, I., Luxburg, U.V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., and Garnett, R. (Eds.) (2017). Advances in Neural Information Processing Systems 30, Curran Associates, Inc.
2. Kaplan, J., McCandlish, S., Henighan, T., Brown, T.B., Chess, B., Child, R., Gray, S., Radford, A., Wu, J., and Amodei, D. (2020). Scaling Laws for Neural Language Models. arXiv.
3. Larochelle, H., Erhan, D., and Bengio, Y. (2008, July 13–17). Zero-Data Learning of New Tasks. Proceedings of the 23rd National Conference on Artificial Intelligence, Chicago, IL, USA.
4. Ouyang, L., Wu, J., Jiang, X., Almeida, D., Wainwright, C.L., Mishkin, P., Zhang, C., Agarwal, S., Slama, K., Ray, A., et al. (2022). Training language models to follow instructions with human feedback. arXiv.
5. Zhu, W., Liu, H., Dong, Q., Xu, J., Huang, S., Kong, L., Chen, J., and Li, L. (2023). Multilingual Machine Translation with Large Language Models: Empirical Results and Analysis. arXiv.