Comparative Analysis of Existing and a Novel Approach to Topic Detection on Conversational Dialogue Data

Published:2022-08-31 Issue:4 Volume:11 Page:1-18
ISSN:2319-4111
Container-title:International Journal on Natural Language Computing
Short-container-title:IJNLC

Author:

Khalid Haider,Wade Vincent

Abstract

Topic detection in dialogue datasets is a significant challenge when the data is unlabeled and no supervision is available, yet it is needed to develop cohesive and engaging dialogue systems. In this paper, we propose unsupervised and semi-supervised techniques for topic detection in conversational dialogue datasets and compare them with existing topic detection techniques. The novel approach takes preprocessed data as input and performs similarity analysis using TF-IDF scores with the bag-of-words (BOW) technique to identify higher-frequency words in dialogue utterances. It then refines these words by combining clustering with the elbow method and uses the Parallel Latent Dirichlet Allocation (PLDA) model to detect the topics. The paper includes a comparative analysis of the proposed approach on the Switchboard, Personachat and MultiWOZ datasets. The experimental results show that the proposed topic detection approach performs significantly better on a semi-supervised dialogue dataset. We also perform topic quantification to check how accurately the extracted topics match manually annotated data: the extracted topics are 92.72% accurate for Switchboard, 87.31% for Personachat and 93.15% for MultiWOZ.
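The first two stages described in the abstract (scoring words with TF-IDF, then choosing a cluster count with the elbow method) can be illustrated with a minimal standard-library sketch. This is not the authors' implementation: the whitespace tokenization, the function names `tfidf_scores`, `top_words` and `elbow_k`, and the second-difference heuristic for locating the elbow are all assumptions made for illustration, and the PLDA step is omitted entirely.

```python
import math
from collections import Counter


def tfidf_scores(utterances):
    """Return one {word: tf-idf} dict per utterance.

    Assumption: lowercase whitespace tokenization stands in for the
    paper's (unspecified) preprocessing.
    """
    docs = [u.lower().split() for u in utterances]
    n_docs = len(docs)
    # Document frequency: number of utterances that contain each word.
    df = Counter(word for doc in docs for word in set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append({
            w: (count / len(doc)) * math.log(n_docs / df[w])
            for w, count in tf.items()
        })
    return scores


def top_words(utterances, k=3):
    """Rank words by their TF-IDF score summed over all utterances."""
    totals = Counter()
    for doc_scores in tfidf_scores(utterances):
        totals.update(doc_scores)  # adds the float scores per word
    return [w for w, _ in totals.most_common(k)]


def elbow_k(inertias):
    """Pick k at the sharpest bend of an inertia curve.

    inertias[i] is the clustering inertia for k = i + 1. The bend is
    taken as the largest second difference (a hypothetical heuristic,
    not necessarily the one used in the paper).
    """
    bends = [
        inertias[i - 1] - 2 * inertias[i] + inertias[i + 1]
        for i in range(1, len(inertias) - 1)
    ]
    return bends.index(max(bends)) + 2


utterances = [
    "the hotel is cheap",
    "the train is late",
    "book the hotel",
]
print(top_words(utterances, k=1))
print(elbow_k([100, 40, 35, 33]))
```

A word that appears in every utterance (here "the") receives a score of zero, which is why TF-IDF is useful for filtering out conversational filler before clustering.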

Publisher

Academy and Industry Research Collaboration Center (AIRCC)

Subject

General Medicine

References: 37 articles.

1. Martin Porcheron, Joel E Fischer, Stuart Reeves, and Sarah Sharples. Voice interfaces in everyday life. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, pages 1-12, 2018.

2. Ryo Ishii, Taichi Katayama, Ryuichiro Higashinaka, and Junji Tomita. Generating body motions using spoken language in dialogue. In Proceedings of the 18th International Conference on Intelligent Virtual Agents, pages 87-92, 2018.

3. Edin Šabić, Daniel Henning, Hunter Myüz, Audrey Morrow, Michael C Hout, and Justin A MacDonald. Examining the role of eye movements during conversational listening in noise. Frontiers in Psychology, 11:200, 2020.

4. Amon Rapp, Lorenzo Curti, and Arianna Boldi. The human side of human-chatbot interaction: A systematic literature review of ten years of research on text-based chatbots. International Journal of Human-Computer Studies, page 102630, 2021.

5. Ryan Lowe, Nissan Pow, Iulian Serban, and Joelle Pineau. The Ubuntu Dialogue Corpus: A large dataset for research in unstructured multi-turn dialogue systems. arXiv preprint arXiv:1506.08909, 2015.

Cited by 1 article.

1. Optimizing Neural Topic Modeling Pipelines for Low-Quality Speech Transcriptions;Lecture Notes in Computer Science;2024