1. Attention is All You Need;vaswani;NeurIPS,2017
2. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding;devlin;NAACL-HLT,2019
3. Language Models are Unsupervised Multitask Learners;radford;Tech Rep,2019
4. Deep Learning Model for Cancer Risk from Low Dose Medical Imaging Radiation;boursalie;ECR,2020