Affiliation:
1. School of Cyber Security and Computer, Hebei University, Baoding 071002, China
Abstract
Source code clone detection, which identifies code fragments with similar functionality, plays a significant role in software development and quality assurance. Existing methods either extract only syntactic or only semantic information, or ignore the associations between code statements across different structures, which makes it difficult for them to detect clone pairs with similar functionality. In this paper, we propose a new model, named DG-IVHFS, based on a dual graph convolutional network (GCN) and interval-valued hesitant fuzzy sets (IVHFS). Specifically, we simplify and group the abstract syntax tree (AST) of the source code to obtain group representations. The group representations of the AST and the control flow graph (CFG) representations are transformed into graph structures, to which we apply GCNs to learn the dependencies between nodes. In addition, we introduce IVHFS into the model for a more comprehensive evaluation of similarity. Experimental results show that the precision, recall, and F1-score of DG-IVHFS reach 98%, 97%, and 97% on the BigCloneBench dataset and 98%, 93%, and 95% on the GoogleCodeJam dataset, exceeding current state-of-the-art models. Moreover, our model performs well in terms of time consumption.
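As a rough illustration of the graph-convolution step mentioned in the abstract, the sketch below applies one GCN propagation step to a toy graph. It assumes the widely used symmetric-normalized formulation (self-loops plus degree normalization), which the abstract does not specify. The helper names `matmul` and `gcn_layer`, the three-node path graph, and the identity feature and weight matrices are hypothetical and are not taken from DG-IVHFS.

```python
import math


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def gcn_layer(adjacency, features, weights):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 X W).

    Assumption: symmetric normalization with self-loops; the paper's
    exact GCN variant is not stated in the abstract.
    """
    n = len(adjacency)
    # Add self-loops so each node keeps part of its own features.
    a_hat = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    # Scale each edge by the inverse square roots of both endpoint degrees.
    inv_sqrt_deg = [1.0 / math.sqrt(sum(row)) for row in a_hat]
    a_norm = [[inv_sqrt_deg[i] * a_hat[i][j] * inv_sqrt_deg[j]
               for j in range(n)]
              for i in range(n)]
    # Aggregate neighbor features, apply the linear map, then ReLU.
    aggregated = matmul(a_norm, features)
    return [[max(0.0, v) for v in row] for row in matmul(aggregated, weights)]


# Toy graph (hypothetical): a path 0 - 1 - 2, e.g. three statement nodes.
adjacency = [[0.0, 1.0, 0.0],
             [1.0, 0.0, 1.0],
             [0.0, 1.0, 0.0]]
identity = [[1.0, 0.0, 0.0],
            [0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0]]

# With identity features and weights, the output is the normalized adjacency.
out = gcn_layer(adjacency, identity, identity)
```

With identity features and weights the output simply exposes the normalized adjacency matrix, which makes it easy to see how much each node draws from itself and from its neighbors. In a real model the features would be node embeddings and the weights would be learned.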
Funder
Natural Science Foundation of Hebei Province
Subject
Electrical and Electronic Engineering, Computer Networks and Communications, Hardware and Architecture, Signal Processing, Control and Systems Engineering