Capture Salient Historical Information: A Fast and Accurate Non-autoregressive Model for Multi-turn Spoken Language Understanding

Authors:

Cheng Lizhi (1), Jia Weijia (2), Yang Wenmian (3)

Affiliation:

1. Shanghai Jiao Tong University, Minhang District, Shanghai, PR China

2. BNU-UIC Institute of Artificial Intelligence and Future Networks, Beijing Normal University (Zhuhai), Guangdong Key Lab of AI and Multi-Modal Data Processing, BNU-HKBU United International College, Jintong Road, Tangjiawan, Zhuhai, Guangdong, PR China

3. Nanyang Technological University, Singapore

Abstract

Spoken Language Understanding (SLU), a core component of task-oriented dialogue systems, must keep inference latency short because human users are impatient. Existing work speeds up inference by designing non-autoregressive models for single-turn SLU, but these models do not extend to multi-turn SLU, where the dialogue history must be taken into account. An intuitive workaround is to concatenate all historical utterances and apply a non-autoregressive model directly; however, this approach loses salient historical information and suffers from the uncoordinated-slot problem. To overcome these shortcomings, we propose a novel model for multi-turn SLU named Salient History Attention with Layer-Refined Transformer (SHA-LRT), which comprises a SHA module, a Layer-Refined Mechanism (LRM), and a Slot Label Generation (SLG) task. SHA captures salient historical information for the current turn from both historical utterances and historical results via a well-designed history-attention mechanism. LRM predicts preliminary SLU results from the Transformer's intermediate states and uses them to guide the final prediction, and SLG supplies sequential dependency information to the non-autoregressive encoder. Experiments on public datasets show that our model significantly improves multi-turn SLU performance (by 17.5% on the Overall metric) while accelerating inference nearly 15-fold over the state-of-the-art baseline, and that it is also effective on single-turn SLU tasks.
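The abstract only names the SHA component; as a purely illustrative aid, here is a minimal PyTorch sketch of what a history-attention module of this kind could look like: the encoded current utterance attends over encoded historical turns, and the attended summary is fused back into the current-turn representation. Everything below (class and parameter names, the fusion layer, tensor shapes) is our assumption for illustration, not the authors' released implementation.

```python
# Hypothetical sketch of a salient-history-attention module; names and
# architecture details are assumptions, not the paper's actual code.
import torch
import torch.nn as nn


class SalientHistoryAttention(nn.Module):
    def __init__(self, hidden_size: int):
        super().__init__()
        self.query_proj = nn.Linear(hidden_size, hidden_size)
        self.key_proj = nn.Linear(hidden_size, hidden_size)
        self.value_proj = nn.Linear(hidden_size, hidden_size)
        self.fuse = nn.Linear(2 * hidden_size, hidden_size)

    def forward(self, current: torch.Tensor, history: torch.Tensor) -> torch.Tensor:
        # current: (batch, cur_len, hidden)  encoded current utterance
        # history: (batch, hist_len, hidden) encoded historical utterances/results
        q = self.query_proj(current)                # (B, Lc, H)
        k = self.key_proj(history)                  # (B, Lh, H)
        v = self.value_proj(history)                # (B, Lh, H)
        scores = torch.bmm(q, k.transpose(1, 2))    # (B, Lc, Lh)
        scores = scores / (q.size(-1) ** 0.5)       # scaled dot-product
        weights = torch.softmax(scores, dim=-1)     # attention over history
        salient = torch.bmm(weights, v)             # (B, Lc, H) salient summary
        # Fuse the salient history summary with the current representation.
        return self.fuse(torch.cat([current, salient], dim=-1))


if __name__ == "__main__":
    sha = SalientHistoryAttention(hidden_size=64)
    cur = torch.randn(2, 10, 64)   # current utterance
    hist = torch.randn(2, 30, 64)  # concatenated dialogue history
    print(sha(cur, hist).shape)    # torch.Size([2, 10, 64])
```

The key design point this sketch illustrates is that history is consumed through attention rather than raw concatenation, so the model can weight salient turns instead of diluting the current utterance with the whole history.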

Funder

Guangdong Key Lab of AI and Multi-modal Data Processing, United International College (UIC), Zhuhai

Chinese National Research Fund

Beijing Normal University (Zhuhai), Guangdong

Zhuhai Science-Tech Innovation Bureau

Publisher

Association for Computing Machinery (ACM)

Subject

Computer Science Applications; General Business, Management and Accounting; Information Systems


Cited by 1 article:

1. A Survey on Non-Autoregressive Generation for Neural Machine Translation and Beyond. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023.
