1. Zhou P, Qi Z, Zheng S, Xu J, Bao H, Xu B Text classification improved by integrating bidirectional LSTM with two-dimensional max pooling. In: Proceedings of COLING 2016, the 26th international conference on computational linguistics: technical papers, Osaka, Japan, Dec 2016. The COLING 2016 Organizing Committee, pp 3485–3495
2. Sherstinsky A (2020) Fundamentals of recurrent neural network (RNN) and long short-term memory (LSTM) network. Physica D: Nonlinear Phenom 404:132306
3. Lee-Thorp J, Ainslie J, Eckstein I, Ontanon S (2021) Fnet: mixing tokens with Fourier transforms
4. Greff K, Srivastava RK, Koutník J, Steunebrink BR, Schmidhuber J (2017) LSTM: a search space odyssey. IEEE Trans Neural Netw Learn Syst 28(10):2222–2232
5. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, Kaiser L, Polosukhin I (2017) Attention is all you need