1. Attention is all you need;Vaswani;NeurIPS,2017
2. ImageNet classification with deep convolutional neural networks;Krizhevsky;Commun. ACM,2017
3. Deep residual learning for image recognition;He,2016
4. Long short-term memory;Hochreiter;Neural Comput.,1997
5. BERT: Pre-training of deep bidirectional transformers for language understanding;Devlin,2018