From Bag-of-Words to Pre-trained Neural Language Models: Improving Automatic Classification of App Reviews for Requirements Engineering

Author:

Araujo Adailton,Golo Marcos,Viana Breno,Sanches Felipe,Romero Roseli,Marcacini Ricardo

Abstract

Popular mobile applications receive millions of user reviews. Thesereviews contain relevant information, such as problem reports and improvementsuggestions. The reviews information is a valuable knowledge source for soft-ware requirements engineering since the analysis of the reviews feedback helpsto make strategic decisions in order to improve the app quality. However, due tothe large volume of texts, the manual extraction of the relevant information is animpracticable task. In this paper, we investigate and compare textual represen-tation models for app reviews classification. We discuss different aspects andapproaches for the reviews representation, analyzing from the classic Bag-of-Words models to the most recent state-of-the-art Pre-trained Neural Languagemodels. Our findings show that the classic Bag-of-Words model, combined witha careful analysis of text pre-processing techniques, is still a competitive model.However, pre-trained neural language models showed to be more advantageoussince it obtains good classification performance, provides significant dimension-ality reduction, and deals more adequately with semantic proximity between thereviews’ texts, especially the multilingual neural language models.

Publisher

Sociedade Brasileira de Computação - SBC

Cited by 15 articles. 订阅此论文施引文献 订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献

1. Interpretable App Review Classification with Transformers;2024 IEEE 32nd International Requirements Engineering Conference Workshops (REW);2024-06-24

2. Classification of User Reviews for Software Maintenance in Indonesian Language Using IndoBERT-BiLSTM (Case Study: MyPertamina);2023 Eighth International Conference on Informatics and Computing (ICIC);2023-12-08

3. Twitter Sentiment Geographical Index Dataset;Scientific Data;2023-10-09

4. MNoR-BERT: multi-label classification of non-functional requirements using BERT;Neural Computing and Applications;2023-08-12

5. Evaluating pre-trained models for user feedback analysis in software engineering: a study on classification of app-reviews;Empirical Software Engineering;2023-05-23

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3