1. Antoine Bordes, Nicolas Usunier, Alberto García-Durán, J. Weston, and Oksana Yakhnenko. 2013. Translating Embeddings for Modeling Multi-relational Data. In NIPS.
2. Andrew M. Dai and Quoc V. Le. 2015. Semi-supervised Sequence Learning. In NIPS.
3. J. Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In NAACL-HLT.
4. Kalpit Dixit and Yaser Al-Onaizan. 2019. Span-Level Model for Relation Extraction. In ACL.
5. William Fedus, Barret Zoph, and Noam M. Shazeer. 2021. Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity. ArXiv, Vol. abs/2101.03961 (2021).