1. Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, 2023. Gpt-4 technical report. arXiv preprint arXiv:2303.08774 (2023).
2. Vatt: Transformers for multimodal self-supervised learning from raw video, audio and text;Akbari Hassan;Advances in Neural Information Processing Systems,2021
3. Jie Cao and Jing Xiao. 2022. An Augmented Benchmark Dataset for Geometric Question Answering through Dual Parallel Text Encoding. In Proceedings of the 29th International Conference on Computational Linguistics Nicoletta Calzolari Chu-Ren Huang Hansaem Kim James Pustejovsky Leo Wanner Key-Sun Choi Pum-Mo Ryu Hsin-Hsi Chen Lucia Donatelli Heng Ji Sadao Kurohashi Patrizia Paggio Nianwen Xue Seokhwan Kim Younggyun Hahm Zhong He Tony Kyungil Lee Enrico Santus Francis Bond and Seung-Hoon Na (Eds.). International Committee on Computational Linguistics Gyeongju Republic of Korea 1511–1520. https://aclanthology.org/2022.coling-1.130
4. GeoQA: A Geometric Question Answering Benchmark Towards Multimodal Numerical Reasoning
5. Karl Cobbe, Vineet Kosaraju, Mohammad Bavarian, Mark Chen, Heewoo Jun, Lukasz Kaiser, Matthias Plappert, Jerry Tworek, Jacob Hilton, Reiichiro Nakano, 2021. Training verifiers to solve math word problems. arXiv preprint arXiv:2110.14168 (2021).