1. Attention is All you Need;Vaswani;Neural Information Processing Systems (NIPS),2017
2. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding;Devlin;North American Chapter of the Association for Computational Linguistics (NAACL),2019
3. RoBERTa: A Robustly Optimized BERT Pretraining Approach;Liu,2019
4. ACE 2005 Multilingual Training Corpus;Walker;Linguistic Data Consortium,2006
5. MAVEN: A Massive General Domain Event Detection Dataset