1. Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2015. Neural Machine Translation by Jointly Learning to Align and Translate. In 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings, Yoshua Bengio and Yann LeCun (Eds.). http://arxiv.org/abs/1409.0473
2. Dualformer: A Unified Bidirectional Sequence-to-Sequence Learning
3. Stéphane Clinchant, Kweon Woo Jung, and Vassilina Nikoulina. 2019. On the use of BERT for Neural Machine Translation. CoRR abs/1909.12744 (2019). arXiv:1909.12744 http://arxiv.org/abs/1909.12744
4. Alexis Conneau, Douwe Kiela, Holger Schwenk, Loïc Barrault, and Antoine Bordes. 2017. Supervised Learning of Universal Sentence Representations from Natural Language Inference Data. CoRR abs/1705.02364 (2017). arXiv:1705.02364 http://arxiv.org/abs/1705.02364
5. Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, and Neil Houlsby. 2020. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. CoRR abs/2010.11929 (2020). arXiv:2010.11929 https://arxiv.org/abs/2010.11929