1. ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators; Clark et al.; ArXiv Preprint, 2020
2. SpanBERT: Improving Pre-training by Representing and Predicting Spans; Joshi et al.; Transactions of the Association for Computational Linguistics, 2020
3. Unified Language Model Pre-training for Natural Language Understanding and Generation; Dong et al.; Advances in Neural Information Processing Systems, 2019
4. A Structured Self-Attentive Sentence Embedding; Lin et al.; Proc. Int. Conf. Learning Representations, 2017