1. Abdelali, A., Hassan, S., & Mubarak, H. (2021). Pre-training BERT on Arabic tweets: Practical considerations. arXiv preprint.
2. Alsentzer, E., Murphy, J., Boag, W., Weng, W.-H., Jindi, D., Naumann, T., & McDermott, M. (2019). Publicly available clinical BERT embeddings. In Proceedings of the 2nd Clinical Natural Language Processing Workshop (pp. 72-78).
3. Cañete, J., Chaperon, G., & Fuentes, R. (2019). Spanish pre-trained BERT model and evaluation data. In PML4DC at ICLR.
4. Cui, Y., Che, W., Liu, T., Qin, B., Yang, Z., Wang, S., & Hu, G. (2019). Pre-training with whole word masking for Chinese BERT. arXiv preprint.
5. Dai, A. M., & Le, Q. V. (2015). Semi-supervised sequence learning. In Advances in Neural Information Processing Systems (pp. 3079-3087).