1. Attention is all you need;Vaswani;Advances in neural information processing systems,2017
2. Improving language understanding by generative pre-training;Radford,2018
3. Deep residual learning for image recognition;He,2016
4. Character Recognition Techniques and approaches: a literature review;Mohammed;Mesopotamian J. Comput. Sci.,2021
5. DialoGPT: Large-scale generative pre-training for conversational response generation;Zhang,2019