1. Bahdanau D, Cho K, Bengio Y (2014) Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473
2. Bao Y-X, Shi Q, Shen Q-Q, Cao Y (2022) Spatial-temporal 3D residual correlation network for urban traffic status prediction. Symmetry 14(1):33
3. Bengio Y (2009) Learning deep architectures for AI. Found Trends® Mach Learn 2(1):1–127
4. Bengio Y, Mesnil G, Dauphin Y, Rifai S (2013) Better mixing via deep representations. In: International conference on machine learning, USA, pp 552–560
5. Chevalier G (2018) LARNN: linear attention recurrent neural network. arXiv preprint arXiv:1808.05578