1. Jaime Carbonell and Jade Goldstein. 1998. The Use of MMR, Diversity-Based Reranking for Reordering Documents and Producing Summaries. In SIGIR.
2. Kevin Clark, Minh-Thang Luong, Quoc V. Le, and Christopher D. Manning. 2020. ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators. In ICLR. arXiv:2003.10555 [cs]
3. Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In NAACL-HLT.
4. Beyond accuracy
5. Sébastien Jean, Kyunghyun Cho, Roland Memisevic, and Yoshua Bengio. 2015. On Using Very Large Target Vocabulary for Neural Machine Translation. In ACL-IJCNLP.