1. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser L, Polosukhin I (2017) Attention is all you need. In: Proceedings of the 31st international conference on neural information processing systems. NIPS’17. Curran Associates Inc., Red Hook, pp 6000–6010
2. Kitaev N, Kaiser L, Levskaya A (2020) Reformer: the efficient transformer. In: 8th international conference on learning representations, Addis Ababa, Ethiopia
3. Li S, Jin X, Xuan Y, Zhou X, Chen W, Wang Y-X, Yan X (2019) Enhancing the locality and breaking the memory bottleneck of transformer on time series forecasting. In: Proceedings of the 33rd international conference on neural information processing systems. NeurIPS'19. Curran Associates Inc., Red Hook
4. Wang S, Li BZ, Khabsa M, Fang H, Ma H (2020) Linformer: self-attention with linear complexity. arXiv preprint arXiv:2006.04768
5. Madhusudhanan K, Burchert J, Duong-Trung N, Born S, Schmidt-Thieme L (2023) U-net inspired transformer architecture for far horizon time series forecasting. In: Machine learning and knowledge discovery in databases. Springer, Cham, pp 36–52