1. Iz Beltagy, Matthew E. Peters, and Arman Cohan. 2020. Longformer: The Long-Document Transformer. CoRR abs/2004.05150 (2020). arXiv:2004.05150 https://arxiv.org/abs/2004.05150
2. Aydar Bulatov, Yuri Kuratov, and Mikhail S. Burtsev. 2023. Scaling Transformer to 1M tokens and beyond with RMT. CoRR abs/2304.11062 (2023). arXiv:2304.11062 https://arxiv.org/abs/2304.11062
3. Deli Chen, Yanyan Zou, Keiko Harimoto, Ruihan Bao, Xuancheng Ren, and Xu Sun. 2019. Incorporating Fine-grained Events in Stock Movement Prediction. In Proceedings of the Second Workshop on Economics and Natural Language Processing (ECONLP).
4. Yen-Chun Chen, Linjie Li, Licheng Yu, Ahmed El Kholy, Faisal Ahmed, Zhe Gan, Yu Cheng, and Jingjing Liu. 2019. UNITER: Learning UNiversal Image-TExt Representations. CoRR abs/1909.11740 (2019). arXiv:1909.11740 http://arxiv.org/abs/1909.11740
5. Rewon Child, Scott Gray, Alec Radford, and Ilya Sutskever. 2019. Generating Long Sequences with Sparse Transformers. CoRR abs/1904.10509 (2019). arXiv:1904.10509 http://arxiv.org/abs/1904.10509