1. Devlin, J., Chang, M.W., Lee, K. and Toutanova, K. (2019) BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. 2019 Conference of the North American Chapter of the Association for Computational Linguistics, Vol. 1, 4171-4186.
2. Liu, H., Zhang, Z.X. and Wang, Y.F. (2021) A Review of the Main Optimization and Improvement Methods for the BERT Model. Data Analysis and Knowledge Discovery, 5(1), 3-15.
3. Shao, R.R., Liu, Y.A., Zhang, W., et al. (2022) A Survey of Knowledge Distillation in Deep Learning. Chinese Journal of Computers, 45(8), 1638-1673.
4. Hinton, G., Vinyals, O. and Dean, J. (2015) Distilling the Knowledge in a Neural Network. arXiv: 1503.02531, 1-9.
http://arxiv.org/abs/1503.02531
5. Weibo Text Sentiment Analysis Based on BERT and Deep Learning