Abstract
AbstractIn software, an algorithm is a well-organized sequence of actions that provides the optimal way to complete a task. Algorithmic thinking is also essential to break-down a problem and conceptualize solutions in some steps. The proper selection of an algorithm is pivotal to improve computational performance and software productivity as well as to programming learning. That is, determining a suitable algorithm from a given code is widely relevant in software engineering and programming education. However, both humans and machines find it difficult to identify algorithms from code without any meta-information. This study aims to propose a program code classification model that uses a convolutional neural network (CNN) to classify codes based on the algorithm. First, program codes are transformed into a sequence of structural features (SFs). Second, SFs are transformed into a one-hot binary matrix using several procedures. Third, different structures and hyperparameters of the CNN model are fine-tuned to identify the best model for the code classification task. To do so, 61,614 real-world program codes of different types of algorithms collected from an online judge system are used to train, validate, and evaluate the model. Finally, the experimental results show that the proposed model can identify algorithms and classify program codes with a high percentage of accuracy. The average precision, recall, and F-measure scores of the best CNN model are 95.65%, 95.85%, and 95.70%, respectively, indicating that it outperforms other baseline models.
Funder
Japan Society for the Promotion of Science (JSPS) KAKENHI
Publisher
Springer Science and Business Media LLC
Cited by
16 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献