Affiliation:
1. Institute of Software Systems Engineering, Johannes Kepler University Linz, Austria
Abstract
In software development, developer turnover is among the primary reasons for project failure: it leaves a substantial knowledge gap and places strain on newcomers. Unfortunately, no established methods exist to measure how problem domain knowledge is distributed among developers. Awareness of how this knowledge evolves and which key developers own it helps stakeholders reduce the risks caused by turnover. To this end, this paper introduces a novel, realistic representation of problem domain knowledge distribution: the ConceptRealm. To construct the ConceptRealm, we employ a latent Dirichlet allocation model to represent textual features obtained from 300 K issues and 1.3 M comments from 518 open‐source projects. We analyze whether newly emerged issues and developers share similar concepts, and how well individual developers' concepts align with those of the team over time. We also investigate the impact of departing developers on the frequency of concepts. Finally, we evaluate the soundness of our approach on a closed‐source software project, which allows the results to be validated from a practical standpoint. We find that the ConceptRealm can represent the problem domain knowledge within a project and can be used to predict the alignment of developers with issues. We also observe that projects exhibit many keepers independent of project maturity, and that the abrupt departure of keepers correlates with a decline of their core concepts, as the remaining developers cannot quickly familiarize themselves with those concepts.
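As a rough illustration of the developer-team alignment idea, the following minimal sketch assumes that each issue has already been assigned a concept (topic) distribution, for example by a latent Dirichlet allocation model. The function names, the toy data, the averaging of issue vectors per developer, and the use of cosine similarity as the alignment measure are all assumptions made for illustration; they are not taken from the paper.

```python
from collections import defaultdict
from math import sqrt

def developer_concepts(issue_topics, issue_developers):
    """Average the concept vectors of the issues each developer took part in.

    Hypothetical helper: averaging is an assumed aggregation, not the paper's.
    """
    sums = {}
    counts = defaultdict(int)
    for issue_id, topics in issue_topics.items():
        for dev in issue_developers.get(issue_id, ()):
            if dev not in sums:
                sums[dev] = [0.0] * len(topics)
            sums[dev] = [s + t for s, t in zip(sums[dev], topics)]
            counts[dev] += 1
    return {dev: [s / counts[dev] for s in vec] for dev, vec in sums.items()}

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def team_alignment(dev_vectors):
    """Compare each developer's concept vector with the team's mean vector."""
    n = len(dev_vectors)
    dim = len(next(iter(dev_vectors.values())))
    team = [sum(vec[i] for vec in dev_vectors.values()) / n for i in range(dim)]
    return {dev: cosine(vec, team) for dev, vec in dev_vectors.items()}

# Toy data (invented): three issues over three concepts, and their participants.
issue_topics = {
    "I1": [0.8, 0.1, 0.1],
    "I2": [0.7, 0.2, 0.1],
    "I3": [0.1, 0.1, 0.8],
}
issue_developers = {"I1": ["alice", "bob"], "I2": ["alice"], "I3": ["carol"]}

dev_vectors = developer_concepts(issue_topics, issue_developers)
alignment = team_alignment(dev_vectors)
for dev in sorted(alignment, key=alignment.get, reverse=True):
    print(dev, round(alignment[dev], 3))
```

In this toy setup, a developer whose issues concentrate on a concept that few others touch ends up less aligned with the team vector, which is the kind of signal that could flag knowledge held by only one person.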