Affiliation:
1. Key Laboratory for Ubiquitous Network and Service Software of Liaoning Province, School of Software, Dalian University of Technology, Dalian, P. R. China
2. Faculty of Information Technology, Monash University, Australia
3. College of Computer Science and Technology, Harbin Engineering University, Harbin, China
Abstract
During software maintenance, bug reports are widely employed to improve the software project’s quality. A developer often refers to stored bug reports in a repository for bug resolution. However, this process often requires the developer to peruse a substantial amount of textual information in bug reports, which is lengthy and tedious. Automatic summarization of bug reports is one way to overcome this problem. Both supervised and unsupervised methods have been proposed for the automatic summary generation of bug reports. However, existing methods disregard the significance of duplicate bug reports in summarizing bug reports. In this study, we propose a PageRank-based Summarization Technique (PRST), which utilizes the textual information contained in bug reports and additional information in associated duplicate bug reports. PRST uses three variants of PageRank, based on Vector Space Model (VSM), Jaccard, and WordNet similarity metrics. These variants are utilized to calculate the textual similarity between the sentences of the master bug reports and those of their duplicates. PRST further trains a regression model and predicts the probability of sentences belonging to the summary. Finally, we combine the PageRank values and the regression model scores to rank the sentences and produce the summary for the master bug reports. In addition, we construct two corpora of bug reports and duplicates, namely MBRC and OSCAR. Empirical results suggest that PRST outperforms the state-of-the-art method BRC in terms of Precision, Recall, F-score, and Pyramid Precision. Meanwhile, PRST with WordNet achieves better results than PRST with VSM or Jaccard.
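As a rough illustration of the pipeline outlined in the abstract, the following minimal sketch scores sentences with a Jaccard-weighted PageRank and mixes the result with a per-sentence regression score. It is not the authors' implementation. The function names, the whitespace tokenizer, the graph built over all master and duplicate sentences, the damping factor, and the weighted-sum combination controlled by `alpha` are all assumptions made for illustration; the abstract does not specify how the two scores are combined.

```python
def jaccard(a, b):
    """Jaccard similarity between two token collections."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def pagerank(weights, damping=0.85, iterations=50):
    """Power iteration over a weighted similarity matrix (list of rows)."""
    n = len(weights)
    if n == 0:
        return []
    scores = [1.0 / n] * n
    out_sum = [sum(row) for row in weights]
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            rank = 0.0
            for j in range(n):
                if out_sum[j] > 0:
                    rank += weights[j][i] / out_sum[j] * scores[j]
                else:
                    # Sentence j has no similar neighbour: spread its mass evenly.
                    rank += scores[j] / n
            new_scores.append((1 - damping) / n + damping * rank)
        scores = new_scores
    return scores


def rank_sentences(master, duplicates, regression_scores, alpha=0.5):
    """Return indices of master sentences, best summary candidate first.

    regression_scores holds one hypothetical probability per master sentence;
    alpha is an assumed mixing weight, not a value taken from the paper.
    """
    sentences = master + duplicates
    tokens = [s.lower().split() for s in sentences]
    n = len(sentences)
    weights = [
        [jaccard(tokens[i], tokens[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    pr = pagerank(weights)
    top = max(pr[: len(master)]) or 1.0  # normalise PageRank to [0, 1]
    combined = [
        alpha * pr[i] / top + (1 - alpha) * regression_scores[i]
        for i in range(len(master))
    ]
    return sorted(range(len(master)), key=lambda i: combined[i], reverse=True)


if __name__ == "__main__":
    master = ["app crashes on startup", "thanks for the report"]
    duplicates = ["crashes on startup after update"]
    print(rank_sentences(master, duplicates, regression_scores=[0.5, 0.5]))
```

In this toy input the first master sentence shares words with the duplicate report and the second does not, so with equal regression scores the PageRank term decides the order. Swapping `jaccard` for a VSM cosine or a WordNet-based similarity would give the other two variants named in the abstract.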
Publisher
World Scientific Pub Co Pte Lt
Subject
Artificial Intelligence, Computer Graphics and Computer-Aided Design, Computer Networks and Communications, Software
Cited by
28 articles.