1. Pouyanfar S, Sadiq S, Yan Y, Tian H, Tao Y, Reyes MP, Shyu ML, Chen SC, Iyengar SS (2018) A survey on deep learning: algorithms, techniques, and applications. ACM Comput Surv CSUR 51(5):1–36
2. Devlin J, Chang MW, Lee K, Toutanova K (2018) BERT: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805
3. Radford A, Wu J, Child R, Luan D, Amodei D, Sutskever I (2019) Language models are unsupervised multitask learners. OpenAI Blog 1(8):9
4. Yang Z, Dai Z, Yang Y, Carbonell J, Salakhutdinov RR, Le QV (2019) XLNET: generalized autoregressive pretraining for language understanding. In: Advances in neural information processing systems 32 (NeurIPS)
5. Antoun W, Baly F, Hajj H (2020) AraBERT: transformer-based model for Arabic language understanding. arXiv preprint arXiv:2003.00104