1. Bai Y, Jones A, Ndousse K, et al. (2022) Training a helpful and harmless assistant with reinforcement learning from human feedback. arXiv preprint arXiv:2204.05862.
2. Black S, Biderman S, Hallahan E, et al. (2022) GPT-NeoX-20B: An open-source autoregressive language model. arXiv preprint arXiv:2204.06745.
3. Chen M, Tworek J, Jun H, et al. (2021) Evaluating large language models trained on code. arXiv preprint arXiv:2107.03374.
4. Clark K, Luong MT, Le QV, et al. (2020) ELECTRA: Pre-training text encoders as discriminators rather than generators. arXiv preprint arXiv:2003.10555.