1. Yi Chen, Rui Wang, Haiyun Jiang, Shuming Shi, and Ruifeng Xu. 2023. Exploring the Use of Large Language Models for Reference-Free Text Quality Evaluation: A Preliminary Empirical Study. arxiv:2304.00723 [cs.CL]
2. Survey on evaluation methods for dialogue systems
3. Sarik Ghazarian, Ralph Weischedel, Aram Galstyan, and Nanyun Peng. 2020. Predictive Engagement: An Efficient Metric For Automatic Evaluation of Open-Domain Dialogue Systems. arxiv:1911.01456 [cs.CL]
4. Weakly Supervised Turn-level Engagingness Evaluator for Dialogues
5. Longxuan Ma, Ziyu Zhuang, Weinan Zhang, Mingda Li, and Ting Liu. 2022. SelF-Eval: Self-supervised Fine-grained Dialogue Evaluation. In Proceedings of the 29th International Conference on Computational Linguistics. International Committee on Computational Linguistics, Gyeongju, Republic of Korea, 485–495. https://aclanthology.org/2022.coling-1.39