Affiliation:
1. School of Computer Science and Engineering, Beihang University, Beijing, China
Abstract
Context: The UML class diagram is commonly used to model functional structures and software code structures in both the preliminary and detailed design stages. The abstraction level of these class diagrams is usually higher than that of the source code, and trace links between the diagrams and the code are often missing. This makes the source code harder to understand and hinders software evolution and maintenance.
Objective: The main goal of this article is to establish trace links between highly abstracted UML class diagrams from the design phase and the source code, and ultimately to help practitioners better understand the source code.
Method: We propose an approach for automatically establishing trace links between design-phase UML class diagrams and source code. To bridge the gap in abstraction level between them, we extend the UML class diagram by mining synonymous phrases for class names and by deducing latent missing relationships between classes from multiple design documents. We then build the trace links in two phases: initial construction with fuzzy matching, followed by further optimization through class relationship inference.
Results: Experiments on five open-source projects show that our approach achieves recall above 94% and F2-scores above 88%, with gains of 30% to 60% over the four baselines.
Conclusion: Our work can serve as a reference for establishing initial trace links between highly abstracted UML class diagrams and source code. To cope with the higher abstraction of design diagrams, we extend UML class diagrams using statistical analysis of multiple design documents. To ensure the quality of the trace links, we design a two-phase approach that first obtains "full but not good enough" trace links and then filters out the "probably wrong" ones. Experiments show that the main techniques of our approach play an important role in tracing between high-level class diagrams and source code.
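Illustrative sketch: The two-phase idea described under Method can be pictured roughly as follows. This is a minimal sketch, not the authors' implementation: the similarity measure (`difflib.SequenceMatcher` from the Python standard library), the 0.6 threshold, the rule used to filter links, and all class names are assumptions chosen for illustration. The `f_beta` helper uses the standard F-beta definition, with beta = 2 giving the F2-score reported under Results.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1] (assumed measure)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def initial_links(design_classes, code_classes, threshold=0.6):
    """Phase 1: fuzzy-match each design class (and its synonyms) to code classes.

    design_classes maps a design class name to a list of synonymous phrases.
    The threshold is a hypothetical value, not taken from the paper.
    """
    links = set()
    for name, synonyms in design_classes.items():
        for code in code_classes:
            score = max(similarity(term, code) for term in [name, *synonyms])
            if score >= threshold:
                links.add((name, code))
    return links


def refine_links(links, design_relations, code_relations):
    """Phase 2: keep a link only if a related design class is linked to a
    code class that is itself related to this link's code class.

    Links of design classes without any relationship are kept unchanged.
    """
    kept = set()
    for d, c in links:
        neighbours = {y for x, y in design_relations if x == d}
        neighbours |= {x for x, y in design_relations if y == d}
        if not neighbours:
            kept.add((d, c))
            continue
        candidates = {c2 for d2, c2 in links if d2 in neighbours}
        if any((c, c2) in code_relations or (c2, c) in code_relations
               for c2 in candidates):
            kept.add((d, c))
    return kept


def f_beta(precision: float, recall: float, beta: float = 2.0) -> float:
    """Standard F-beta score; beta = 2 weights recall higher than precision."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical toy data: two design classes, three code classes.
design = {"Order": ["Purchase"], "Customer": ["Client"]}
code = ["OrderImpl", "ClientAccount", "Recorder"]
phase1 = initial_links(design, code)
phase2 = refine_links(
    phase1,
    design_relations={("Customer", "Order")},
    code_relations={("ClientAccount", "OrderImpl")},
)
```

In this toy example, the first phase also links `Order` to `Recorder` because the strings overlap, and the second phase drops that link because `Recorder` has no relationship to any code class linked to `Customer`.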
Funder
National Natural Science Foundation of China