1. Bespalov, D., Bhabesh, S., Xiang, Y., Zhou, L., Qi, Y.: Towards building a robust toxicity predictor. In: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 5: Industry Track), pp. 581–598 (2023)
2. Clark, K., Luong, M.T., Le, Q.V., Manning, C.D.: ELECTRA: pre-training text encoders as discriminators rather than generators. In: International Conference on Learning Representations (2020)
3. Cui, Y., Che, W., Liu, T., Qin, B., Wang, S., Hu, G.: Revisiting pre-trained models for Chinese natural language processing. In: Findings of the Association for Computational Linguistics: EMNLP 2020 (2020). https://api.semanticscholar.org/CorpusID:216641856
4. Deng, C.: PK-Chat: pointer network guided knowledge driven generative dialogue model. ArXiv abs/2304.00592 (2023). https://api.semanticscholar.org/CorpusID:257913448
5. Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. In: North American Chapter of the Association for Computational Linguistics (2019). https://api.semanticscholar.org/CorpusID:52967399