Sentence embedding and fine-tuning to automatically identify duplicate bugs
Published:2023-01-19
Volume:4
ISSN:2624-9898
Container-title:Frontiers in Computer Science
Short-container-title:Front. Comput. Sci.
Author:
Isotani Haruna,Washizaki Hironori,Fukazawa Yoshiaki,Nomoto Tsutomu,Ouji Saori,Saito Shinobu
Abstract
Industrial software maintenance is critical but burdensome, and activities such as detecting duplicate bug reports are often performed manually. This paper proposes an automated duplicate bug report detection system to improve maintenance efficiency. The system uses vectorization of the report contents and deep learning-based sentence embedding to calculate the similarity of whole reports from the vectors of their individual elements. Specifically, sentence embedding is realized by fine-tuning Sentence-BERT. The system is validated experimentally against baseline methods, and the results indicate that it detects duplicate bug reports more effectively than existing methods.
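The idea of scoring a whole report from the vectors of its individual elements can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the field names (`title`, `description`), the weights, the weighted-mean aggregation, and the helper names are hypothetical, and `embed` is a toy bag-of-words stand-in for a fine-tuned Sentence-BERT encoder so that the example runs with the standard library only.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a sentence encoder (assumption): bag-of-words counts.

    A real system would return a dense vector from a fine-tuned
    Sentence-BERT model instead.
    """
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def report_similarity(a, b, weights):
    """Weighted mean of per-field cosine similarities (assumed aggregation)."""
    total = sum(weights.values())
    score = 0.0
    for field, w in weights.items():
        score += w * cosine(embed(a.get(field, "")), embed(b.get(field, "")))
    return score / total


# Hypothetical field weights and example reports.
WEIGHTS = {"title": 0.4, "description": 0.6}

report_a = {
    "title": "App crashes on login",
    "description": "The app crashes when the user taps the login button",
}
report_b = {
    "title": "Crash on login screen",
    "description": "Tapping the login button makes the app crash",
}
report_c = {
    "title": "Typo in settings page",
    "description": "The word language is misspelled in settings",
}

sim_ab = report_similarity(report_a, report_b, WEIGHTS)
sim_ac = report_similarity(report_a, report_c, WEIGHTS)
print(f"a vs b: {sim_ab:.3f}, a vs c: {sim_ac:.3f}")
```

In this sketch, a candidate pair would be flagged as a likely duplicate when its score exceeds a threshold chosen on labeled data; swapping `embed` for a learned encoder changes only the per-field vectors, not the aggregation step.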
Publisher
Frontiers Media SA
Subject
Computer Science Applications,Computer Vision and Pattern Recognition,Human-Computer Interaction,Computer Science (miscellaneous)
References: 40 articles.
1. Natural language understanding for argumentative dialogue systems in the opinion building domain; Abro; Knowl. Based Syst., 2022
2. Multi-turn intent determination and slot filling with neural networks and regular expressions; Abro; Knowl. Based Syst., 2020
3. Fast detection of duplicate bug reports using LDA-based topic modeling and classification; Akilan, 2020
4. Using BERT to predict bug-fixing time; Ardimento, 2020
5. A simple but tough-to-beat baseline for sentence embeddings; Arora, 2017