Uncertainty-aware Automatic Evaluation Method for Open-domain Dialogue Systems

Published: 2023 Issue: 2 Volume: 30 Page: 531–556
ISSN: 1340-7619
Container-title: Journal of Natural Language Processing
Language: en

Author:

Tsuta Yuma¹, Yoshinaga Naoki², Toyoda Masashi²

Affiliation:

1. Graduate School of Information Science and Technology, The University of Tokyo

2. Institute of Industrial Science, The University of Tokyo

Publisher

Association for Natural Language Processing

Subject

General Earth and Planetary Sciences,General Environmental Science

Link

https://www.jstage.jst.go.jp/article/jnlp/30/2/30_531/_pdf

References: 37 articles.

1. Adiwardana, D., Luong, M.-T., So, D. R., Hall, J., Fiedel, N., Thoppilan, R., Yang, Z., Kulshreshtha, A., Nemade, G., Lu, Y., and Le, Q. V. (2020). “Towards a Human-like Open-Domain Chatbot.” arXiv preprint arXiv:2001.09977.

2. Banerjee, S. and Lavie, A. (2005). “METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments.” In Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization, pp. 65–72, Ann Arbor, Michigan. Association for Computational Linguistics.

3. Cho, K., van Merriënboer, B., Bahdanau, D., and Bengio, Y. (2014). “On the Properties of Neural Machine Translation: Encoder–Decoder Approaches.” In Proceedings of SSST-8, Eighth Workshop on Syntax, Semantics and Structure in Statistical Translation, pp. 103–111, Doha, Qatar. Association for Computational Linguistics.

4. Devlin, J., Chang, M.-W., Lee, K., and Toutanova, K. (2019). “BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding.” In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pp. 4171–4186, Minneapolis, Minnesota. Association for Computational Linguistics.

5. Galley, M., Brockett, C., Sordoni, A., Ji, Y., Auli, M., Quirk, C., Mitchell, M., Gao, J., and Dolan, B. (2015). “deltaBLEU: A Discriminative Metric for Generation Tasks with Intrinsically Diverse Targets.” In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), pp. 445–450, Beijing, China. Association for Computational Linguistics.