A Primer on Seq2Seq Models for Generative Chatbots

Published: 2023-10-06 Issue: 3 Volume: 56 Pages: 1-58
ISSN: 0360-0300
Container-title: ACM Computing Surveys
Language: en
Short-container-title: ACM Comput. Surv.

Author:

Scotti Vincenzo¹, Sbattella Licia¹, Tedesco Roberto¹

Affiliation:

1. DEIB, Politecnico di Milano, Italy

Abstract

The recent spread of Deep Learning-based solutions for Artificial Intelligence and the development of Large Language Models have significantly advanced the field of Natural Language Processing (NLP). Over the last ten years, the approach has evolved quickly and has deeply affected NLP, from low-level text pre-processing tasks (such as tokenisation or POS tagging) to high-level, complex applications like machine translation and chatbots. This article examines recent trends in the development of open-domain, data-driven generative chatbots, focusing on Seq2Seq architectures. Such architectures are compatible with multiple learning approaches, ranging from supervised to reinforcement learning, and in recent years they have made it possible to build very engaging open-domain chatbots. Not only can these architectures directly output the next turn in a conversation but, to some extent, they also make it possible to control the style or content of the response. To offer a complete view of the subject, we examine possible architecture implementations as well as training and evaluation approaches. Additionally, we provide information about the openly available corpora for training and evaluating such models and about current and past chatbot competitions. Finally, we present some insights into possible future directions, given the current state of research.

Publisher

Association for Computing Machinery (ACM)

Subject

General Computer Science, Theoretical Computer Science

Link

https://dl.acm.org/doi/pdf/10.1145/3604281

References (222 articles)

1. Daniel Adiwardana, Minh-Thang Luong, David R. So, Jamie Hall, Noah Fiedel, Romal Thoppilan, Zi Yang, Apoorv Kulshreshtha, Gaurav Nemade, Yifeng Lu, and Quoc V. Le. 2020. Towards a human-like open-domain chatbot. arXiv:2001.09977. Retrieved from https://arxiv.org/abs/2001.09977.

2. Manish Agnihotri. 2020. Pattern Recognition. ICPR Int. Workshops and Challenges - Virtual Event, January 10–15, 2021, Proceedings, Part II.

3. Mohammad Aliannejadi, Julia Kiseleva, Aleksandr Chuklin, Jeff Dalton, and Mikhail S. Burtsev. 2020. ConvAI3: Generating clarifying questions for open-domain dialogue systems (ClariQ). arXiv:2009.11352. Retrieved from https://arxiv.org/abs/2009.11352.

4. James Allen and Mark Core. 1997. Draft of DAMSL: Dialog act markup in several layers. https://www.cs.rochester.edu/research/cisd/resources/damsl/RevisedManual/.

5. Sanjeev Arora, Yingyu Liang, and Tengyu Ma. 2017. A simple but tough-to-beat baseline for sentence embeddings. In Proceedings of the 5th International Conference on Learning Representations. OpenReview.net.

Cited by 3 articles.

1. Chatbots for Mental Health: Leveraging LSTM and Seq2Seq Architectures to Enhance User Well-being. 2024 IEEE International Conference on Information Technology, Electronics and Intelligent Communication Systems (ICITEICS). 2024-06-28.

2. An Intelligent Chatbot for Faculty Administration Using Bidirectional LSTM and Seq2Seq Architecture. 2024 International Conference on Smart Computing, IoT and Machine Learning (SIML). 2024-06-06.

3. Evaluating the Experience of LGBTQ+ People Using Large Language Model Based Chatbots for Mental Health Support. Proceedings of the CHI Conference on Human Factors in Computing Systems. 2024-05-11.