Affiliation:
1. Indian Institute of Technology Ropar, Rupnagar, Punjab, India
Abstract
An important issue faced during software development is identifying defects in a given source file and, where defects are found, their properties. Determining the defectiveness of source code is significant because of its implications for software development and maintenance costs.
We present a novel system to estimate the presence of defects in source code and detect attributes of the possible defects, such as their severity. The salient elements of our system are: (i) a dataset of newly introduced source code metrics, called PROgramming CONstruct (PROCON) metrics, and (ii) a novel Machine-Learning (ML)-based system, called Defect Estimator for Source Code (DESCo), that makes use of the PROCON dataset for predicting defectiveness in a given scenario. The dataset was created by processing 30,400+ source files written in four popular programming languages, viz., C, C++, Java, and Python.
The results of our experiments show that the DESCo system outperforms one of the state-of-the-art methods with an improvement of 44.9%. To verify the correctness of our system, we compared the performance of 12 different ML algorithms with 50+ different combinations of their key parameters. Our system achieves its best results with the SVM technique, with a mean accuracy of 80.8%.
Publisher
Association for Computing Machinery (ACM)
Cited by
12 articles.