A Survey on Recent Advances and Challenges in Reinforcement Learning Methods for Task-oriented Dialogue Policy Learning

Published: 2023-01-07
Volume: 20, Issue: 3, Pages: 318-334
ISSN: 2731-538X
Container title: Machine Intelligence Research
Language: en
Short container title: Mach. Intell. Res.

Author:

Kwan Wai-Chung, Wang Hong-Ru, Wang Hui-Min, Wong Kam-Fai

Abstract

Dialogue policy learning (DPL) is a key component of a task-oriented dialogue (TOD) system. Its goal is to decide the dialogue system's next action at each turn, given the dialogue state, according to a learned dialogue policy. Reinforcement learning (RL) is widely used to optimize this dialogue policy. In the learning process, the user is regarded as the environment and the system as the agent. In this paper, we present an overview of the recent advances and challenges in dialogue policy from the perspective of RL. More specifically, we identify the problems and summarize corresponding solutions for RL-based dialogue policy learning. In addition, we provide a comprehensive survey of applying RL to DPL by categorizing recent methods according to five basic elements of RL. We believe this survey can shed light on future research in DPL.
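
The agent/environment framing in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: it assumes a toy booking task with two hypothetical slots ("date", "cuisine"), a hand-written simulated user standing in for the environment, invented reward values, and plain tabular Q-learning as one possible RL method. All names (SLOTS, ACTIONS, user_step, train, run_policy) are illustrative.

```python
import random

# Hypothetical toy task: the system must fill two slots before booking.
SLOTS = ("date", "cuisine")
ACTIONS = ("request_date", "request_cuisine", "book")


def user_step(state, action):
    """Simulated user (the environment): returns (next_state, reward, done).

    The reward values are assumptions for this sketch:
    -1 per request turn, +10 for a successful booking, -10 for a premature one.
    """
    if action == "book":
        # Booking succeeds only when every slot has been filled.
        return state, (10.0 if all(state) else -10.0), True
    slot = ACTIONS.index(action)  # request actions share the slot order
    filled = list(state)
    filled[slot] = True  # the simulated user always answers the request
    return tuple(filled), -1.0, False


def greedy_action(q, state):
    """System action with the highest estimated return in this dialogue state."""
    return max(ACTIONS, key=lambda a: q.get((state, a), 0.0))


def train(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.2, max_turns=10, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {}  # maps (state, action) -> estimated return
    for _ in range(episodes):
        state = (False,) * len(SLOTS)  # dialogue state: which slots are filled
        for _ in range(max_turns):
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = greedy_action(q, state)
            next_state, reward, done = user_step(state, action)
            best_next = 0.0 if done else max(
                q.get((next_state, a), 0.0) for a in ACTIONS
            )
            old = q.get((state, action), 0.0)
            q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
            state = next_state
            if done:
                break
    return q


def run_policy(q, max_turns=10):
    """Roll out the learned greedy policy against the simulated user."""
    state, total, taken = (False,) * len(SLOTS), 0.0, []
    for _ in range(max_turns):
        action = greedy_action(q, state)
        state, reward, done = user_step(state, action)
        taken.append(action)
        total += reward
        if done:
            break
    return taken, total


if __name__ == "__main__":
    actions, total_reward = run_policy(train())
    print(actions, total_reward)
```

Under these assumptions, the learned greedy policy should request both slots and then book. Real TOD systems use far larger state and action spaces, which is why the surveyed work relies on function approximation and learned user simulators rather than a lookup table.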

Publisher

Springer Science and Business Media LLC

Subject

Applied Mathematics, Artificial Intelligence, Computer Networks and Communications, Computer Science Applications, Computer Vision and Pattern Recognition, Modeling and Simulation, Signal Processing, Control and Systems Engineering

Link

https://link.springer.com/content/pdf/10.1007/s11633-022-1347-y.pdf

References (118 articles)

1. H. S. Chen, X. R. Liu, D. W. Yin, J. J. Tang. A survey on dialogue systems: Recent advances and new frontiers. ACM SIGKDD Explorations Newsletter, vol. 19, no. 2, pp. 25–35, 2017. DOI: https://doi.org/10.1145/3166054.3166058.

2. M. Lewis, D. Yarats, Y. Dauphin, D. Parikh, D. Batra. Deal or no deal? End-to-end learning of negotiation dialogues. In Proceedings of Conference on Empirical Methods in Natural Language Processing, Copenhagen, Denmark, pp. 2443–2453, 2017. DOI: https://doi.org/10.18653/v1/D17-1259.

3. M. Eric, C. Manning. A copy-augmented sequence-to-sequence architecture gives good performance on task-oriented dialogue. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, Valencia, Spain, pp. 468–473, 2017.

4. T. C. Chi, P. C. Chen, S. Y. Su, Y. N. Chen. Speaker role contextual modeling for language understanding and dialogue policy learning. In Proceedings of the 8th International Joint Conference on Natural Language Processing, Taipei, China, pp. 163–168, 2017.

5. K. Wang, J. F. Tian, R. Wang, X. J. Quan, J. X. Yu. Multi-domain dialogue acts and response co-generation. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 7125–7134, 2020. DOI: https://doi.org/10.18653/v1/2020.acl-main.638.

Cited by 10 articles.

1. Prompting large language models for user simulation in task-oriented dialogue systems;Computer Speech & Language;2025-01

2. Task-based dialogue policy learning based on diffusion models;Applied Intelligence;2024-09-02

3. Towards a Formal Characterization of User Simulation Objectives in Conversational Information Access;Proceedings of the 2024 ACM SIGIR International Conference on Theory of Information Retrieval;2024-08-02

4. KddRES: A Multi-level Knowledge-driven Dialogue Dataset for Restaurant Towards Customized Dialogue System;Computer Speech & Language;2024-08

5. Learning Top-K Subtask Planning Tree Based on Discriminative Representation Pretraining for Decision-making;Machine Intelligence Research;2024-05-29