1. Devlin et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. NAACL-HLT.
2. Yang et al. XLNet: Generalized Autoregressive Pretraining for Language Understanding. NeurIPS.
3. Deep Convolutional Networks for Quality Assessment of Protein Folds.
4. PointConv: Deep Convolutional Networks on 3D Point Clouds.
5. Dosovitskiy et al. An Image Is Worth 16x16 Words: Transformers for Image Recognition at Scale. ICLR.