1. Attention is all you need;Vaswani;31st Conference on Neural Information Processing Systems (NIPS),2017
2. Learned in translation: contextualized word vectors;McCann;Advances in Neural Information Processing Systems,2017
3. SpanBERT: improving pre-training by representing and predicting spans;Joshi;arXiv preprint,2019
4. BERT: pre-training of deep bidirectional transformers for language understanding;Devlin;arXiv preprint,2018
5. Part of speech tagging: a systematic review of deep learning and machine learning approaches;Chiche;Journal of Big Data,2022