Evaluation of effectiveness in conversations between humans and chatbots using parallel convolutional neural networks with multiple temporal resolutions

Author:

Escobar-Grisales Daniel, Vásquez-Correa Juan Camilo, Orozco-Arroyave Juan Rafael

Abstract

Chatbots automate several components of customer service and can support many users at once. Despite these advantages, the large number of conversations a chatbot generates makes it difficult to determine whether customer requests are well addressed. For practical reasons, a chatbot's effectiveness is usually evaluated manually on a small, randomly chosen sample of conversations or through self-reported user satisfaction. This procedure does not guarantee a correct evaluation of the service, because the sample is generally not large enough and self-reports may be influenced by external factors not directly associated with the chatbot's functioning. This study proposes a methodology for the automatic evaluation of chatbot effectiveness in real production environments. The analysis uses convolutional neural networks adapted for natural language processing, with two parallel convolutional layers that evaluate questions and answers independently. The proposed model also incorporates filters that extract features at multiple temporal resolutions. The methodology is tested on real conversations from chatbots that provide service to two different companies. The results are compared with baseline models based on classical techniques with different pre-trained word embedding models. According to our results, the proposed approach achieves accuracies between 78.95% and 80.18%, outperforming the best baseline result by 2.9%.
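The architecture described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the filter widths, the number of filters, the embedding size, the pooling scheme, or the classifier, so the values below (widths 2, 3 and 4, two filters per width, 4-dimensional vectors, ReLU followed by max-over-time pooling) are assumptions chosen for illustration, and the random weights stand in for trained parameters. All function names are hypothetical. The sketch shows only the two ideas named in the abstract: one convolutional branch for the question and a separate one for the answer, each using filters of several widths so that features are extracted at several temporal resolutions.

```python
import random

def conv1d_max(seq, kernel):
    """Slide one filter over a sequence of word vectors and return the
    largest ReLU activation (max-over-time pooling, an assumption here).
    Returns 0.0 if the sequence is shorter than the filter."""
    width = len(kernel)
    best = 0.0  # ReLU floor: negative activations are clipped to zero
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        act = sum(
            w * x
            for k_row, vec in zip(kernel, window)
            for w, x in zip(k_row, vec)
        )
        best = max(best, act)
    return best

def branch_features(seq, filters_by_width):
    """One convolutional branch: filters of several widths (temporal
    resolutions), each reduced to a single pooled feature."""
    feats = []
    for width in sorted(filters_by_width):
        for kernel in filters_by_width[width]:
            feats.append(conv1d_max(seq, kernel))
    return feats

def make_filters(rng, widths, n_filters, dim):
    """Random placeholder weights; a real model would learn these."""
    return {
        w: [[[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(w)]
            for _ in range(n_filters)]
        for w in widths
    }

def parallel_features(question, answer, q_filters, a_filters):
    """Process question and answer in independent branches, then
    concatenate the pooled features for a downstream classifier."""
    return branch_features(question, q_filters) + branch_features(answer, a_filters)

# Illustrative sizes only (not taken from the paper).
DIM, WIDTHS, N_FILTERS = 4, (2, 3, 4), 2
rng = random.Random(0)
question = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(6)]
answer = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(8)]
q_filters = make_filters(rng, WIDTHS, N_FILTERS, DIM)
a_filters = make_filters(rng, WIDTHS, N_FILTERS, DIM)

features = parallel_features(question, answer, q_filters, a_filters)
print(len(features))
```

With these assumed sizes the concatenated vector has 12 entries (2 branches, 3 widths, 2 filters each). In a full model this vector would feed a classification layer that predicts whether the request was addressed effectively.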

Funder

H2020 Marie Skłodowska-Curie Actions

Universidad de Antioquia

Pratech

University of Antioquia

Publisher

Springer Science and Business Media LLC

Subject

Computer Networks and Communications, Hardware and Architecture, Media Technology, Software

