1. Xiong W, Gupta A, Toshniwal S, Mehdad Y, Yih W-t (2022) Adapting pretrained text-to-text models for long text sequences. arXiv:2209.10052
2. Pang B, Nijkamp E, Kryściński W, Savarese S, Zhou Y, Xiong C (2023) Long document summarization with top-down and bottom-up inference. In: Findings of the Association for Computational Linguistics: EACL 2023, pp 1237–1254
3. Brown T, Mann B, Ryder N, Subbiah M, Kaplan JD, Dhariwal P, Neelakantan A, Shyam P, Sastry G, Askell A et al (2020) Language models are few-shot learners. Adv Neural Inf Process Syst 33:1877–1901
4. OpenAI (2023) GPT-4 technical report. arXiv:2303.08774
5. Touvron H, Lavril T, Izacard G, Martinet X, Lachaux M-A, Lacroix T, Rozière B, Goyal N, Hambro E, Azhar F et al (2023) LLaMA: open and efficient foundation language models. arXiv:2302.13971