STOD: Towards Scalable Task-Oriented Dialogue System on MultiWOZ-API
Published:2024-06-19
Issue:12
Volume:14
Page:5303
ISSN:2076-3417
Container-title:Applied Sciences
language:en
Short-container-title:Applied Sciences
Author:
Lu Hengtong¹, Yuan Caixia¹, Wang Xiaojie¹
Affiliation:
1. School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing 100083, China
Abstract
Task-oriented dialogue systems (TODs) enable users to complete specific goals and are widely used in practice. Although existing models perform well on single-domain dialogues, their scalability to new domains remains largely unexplored. Traditional dialogue systems rely on domain-specific information such as the dialogue state and the database (DB), which limits their scalability. In this paper, we propose a Scalable Task-Oriented Dialogue modeling framework (STOD). Instead of labeling multiple dialogue components, as previous work does, we only predict structured API queries to interact with the DB and generate responses based on the complete DB results. Further, we construct MultiWOZ-API, a new API-schema-based TOD dataset with API query and DB result annotations, built on MultiWOZ 2.1. We then propose MSTOD and CSTOD for multi-domain and cross-domain TOD systems, respectively. We perform extensive qualitative experiments to verify the effectiveness of our proposed framework. We find the following. (1) Scalability across multiple domains: MSTOD improves on the previous state of the art by 2% in multi-domain TOD. (2) Scalability to new domains: our framework generalizes well to new domains, outperforming existing baselines by a significant margin of 10%.
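To make the pipeline described in the abstract concrete, the following is a minimal sketch of the "predict an API query, query the DB, respond from the full DB results" loop. It is not the authors' implementation: the `ApiQuery` class, the `find_restaurant` API name, the slot names, the toy database entries, and the keyword-matching `predict_api_query` function are all hypothetical stand-ins chosen for illustration. In the actual framework the query and the response would be produced by learned models.

```python
from dataclasses import dataclass, field


@dataclass
class ApiQuery:
    """Hypothetical structured query: an API name plus slot constraints."""
    api: str
    constraints: dict = field(default_factory=dict)


# Toy in-memory database; every entry is invented for illustration.
TOY_DB = {
    "find_restaurant": [
        {"name": "Alpha Bistro", "area": "centre", "food": "italian"},
        {"name": "Beta Kitchen", "area": "north", "food": "italian"},
        {"name": "Gamma Grill", "area": "centre", "food": "indian"},
    ],
}

# Hypothetical slot vocabulary used by the keyword-matching stand-in below.
SLOT_VALUES = {"area": ["centre", "north"], "food": ["italian", "indian"]}


def predict_api_query(dialogue_history):
    """Stand-in for a learned model mapping dialogue history to an API query."""
    text = " ".join(dialogue_history).lower()
    constraints = {}
    for slot, values in SLOT_VALUES.items():
        for value in values:
            if value in text:
                constraints[slot] = value
    return ApiQuery("find_restaurant", constraints)


def execute(query, db=TOY_DB):
    """Return every DB row that satisfies all constraints of the query."""
    rows = db.get(query.api, [])
    return [
        row for row in rows
        if all(row.get(slot) == value for slot, value in query.constraints.items())
    ]


def generate_response(db_results):
    """Stand-in for a response generator conditioned on the complete DB results."""
    if not db_results:
        return "Sorry, I could not find a match."
    names = ", ".join(row["name"] for row in db_results)
    return f"I found {len(db_results)} option(s): {names}."


history = ["I want italian food in the centre"]
query = predict_api_query(history)
results = execute(query)
print(generate_response(results))
```

The point of the sketch is the interface: the only domain-specific artifact is the API query, so supporting a new domain would mean adding a new API schema rather than new dialogue-state or dialogue-act annotations.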
Funder
National Natural Science Foundation of China