1. Adversarial weight perturbation helps robust generalization;wu;Proc Adv Neural Inf Process Syst,0
2. RoBERTa: A robustly optimized BERT pretraining approach;liu,2019
3. Adversarial training for free;shafahi;Proc 33rd Int Conf Neural Inf Process Syst,0
4. ALBERT: A lite BERT for self-supervised learning of language representations;lan;Proc Int Conf Learn Representations,0
5. Adversarial machine learning at scale;kurakin;Proc Int Conf Learn Representations,0