1. Hugging Face: The AI community building the future. https://huggingface.co/
2. Bahdanau, D., Cho, K., Bengio, Y.: Neural machine translation by jointly learning to align and translate (2016)
3. Chung, J., Gulcehre, C., Cho, K.H., Bengio, Y.: Empirical evaluation of gated recurrent neural networks on sequence modeling (2014)
4. Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding (2019)
5. Dosovitskiy, A., et al.: An image is worth $16\times 16$ words: transformers for image recognition at scale (2021)