1. Devlin, J., Chang, M.-W., Lee, K., and Toutanova, K. (2019, June 2–7). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2019), Minneapolis, MN, USA.
2. Peng, Y., Yan, S., and Lu, Z. (2019, August 1). Transfer Learning in Biomedical Natural Language Processing: An Evaluation of BERT and ELMo on Ten Benchmarking Datasets. Proceedings of the 2019 Workshop on Biomedical Natural Language Processing (BioNLP 2019), Florence, Italy.
3. Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., and Stoyanov, V. (2019). RoBERTa: A Robustly Optimized BERT Pretraining Approach. arXiv.
4. Tenney, I., Das, D., and Pavlick, E. (2019, July 28–August 2). BERT Rediscovers the Classical NLP Pipeline. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy.
5. Miaschi, A., Brunato, D., Dell’Orletta, F., and Venturi, G. (2020, December 8–13). Linguistic Profiling of a Neural Language Model. Proceedings of the 28th International Conference on Computational Linguistics, COLING 2020, Barcelona, Spain (Online).