1. SemEval-2017 Task 1: Semantic textual similarity multilingual and crosslingual focused evaluation;Cer,2017
2. An analysis of single-layer networks in unsupervised feature learning;Coates,2011
3. BERT: Pre-training of deep bidirectional transformers for language understanding;Devlin,2019
4. How contextual are contextualized word representations? Comparing the geometry of BERT, ELMo, and GPT-2 embeddings;Ethayarajh,2019
5. FRAGE: Frequency-agnostic word representation;Gong,2018