Abstract
This paper describes how we developed Amaia, a conversational agent for Portuguese entrepreneurs. After introducing the domain corpus used as Amaia’s Knowledge Base (KB), we extensively compare approaches for automatically matching user requests with Frequently Asked Questions (FAQs) in the KB. The comparison covers Information Retrieval (IR), approaches based on static and contextual word embeddings, and a model of Semantic Textual Similarity (STS) trained for Portuguese; the STS model achieved the best performance. We further describe how we decreased the model’s complexity and improved its scalability, with minimal impact on performance. The final version of Amaia combines an IR library and an STS model with reduced features. Towards a more human-like behavior, Amaia can also answer out-of-domain questions, based on a second corpus integrated in the KB. Such interactions are identified with a text classifier, also described in the paper.
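To make the IR side of the matching task concrete, the following is a minimal sketch of retrieving the closest FAQ for a user request with TF-IDF weights and cosine similarity. It is not Amaia's implementation: the function names, the sample FAQs, and the similarity threshold are all hypothetical, and the threshold-based fallback is only a simple stand-in for the out-of-domain text classifier mentioned above.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def build_index(questions):
    """Compute smoothed IDF weights and one TF-IDF vector per FAQ question."""
    docs = [Counter(tokenize(q)) for q in questions]
    n = len(docs)
    # Document frequency: number of questions in which each term occurs.
    df = Counter(term for doc in docs for term in doc)
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in doc.items()} for doc in docs]
    return idf, vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match(query, idf, vectors, threshold=0.2):
    """Return (index of best FAQ, score), or (None, score) below threshold.

    The threshold is an assumed value chosen for illustration only.
    """
    counts = Counter(tokenize(query))
    # Terms unseen in the FAQ collection carry no weight.
    query_vec = {t: tf * idf[t] for t, tf in counts.items() if t in idf}
    scores = [cosine(query_vec, v) for v in vectors]
    best = max(range(len(scores)), key=scores.__getitem__)
    if scores[best] < threshold:
        return None, scores[best]
    return best, scores[best]


# Hypothetical FAQ questions, not taken from the paper's corpus.
FAQS = [
    "How do I register a new company?",
    "What taxes does a small business pay?",
    "How can I close my company?",
]

if __name__ == "__main__":
    idf, vectors = build_index(FAQS)
    index, score = match("register a company", idf, vectors)
    print(index, round(score, 3))
```

In a pipeline like the one the abstract outlines, a cheap retrieval step of this kind would typically narrow the candidates before a more expensive similarity model re-scores them; that division of labour is an assumption here, not a detail confirmed by the abstract.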