1. Frieder S, Pinchetti L, Griffiths R-R, Salvatori T, Lukasiewicz T, Petersen PC, Chevalier A, Berner J (2023) Mathematical capabilities of ChatGPT. arXiv preprint arXiv:2301.13867
2. Wei J, Wang X, Schuurmans D, Bosma M, Xia F, Chi EH, Le QV, Zhou D et al (2022) Chain-of-thought prompting elicits reasoning in large language models. In: Advances in neural information processing systems
3. Kojima T, Gu SS, Reid M, Matsuo Y, Iwasawa Y (2022) Large language models are zero-shot reasoners. In: Advances in neural information processing systems
4. Qiao S, Ou Y, Zhang N, Chen X, Yao Y, Deng S, Tan C, Huang F, Chen H (2023) Reasoning with language model prompting: A survey. In: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 5368–5393. Association for Computational Linguistics, Toronto, Canada. https://doi.org/10.18653/v1/2023.acl-long.294. https://aclanthology.org/2023.acl-long.294
5. Cobbe K, Kosaraju V, Bavarian M, Chen M, Jun H, Kaiser L, Plappert M, Tworek J, Hilton J, Nakano R et al (2021) Training verifiers to solve math word problems. arXiv preprint arXiv:2110.14168