Machine Learning Techniques for Code Smells Detection: A Systematic Mapping Study

Published:2019-02 Issue:02 Volume:29 Page:285-316
ISSN:0218-1940
Container-title:International Journal of Software Engineering and Knowledge Engineering
language:en
Short-container-title:Int. J. Soft. Eng. Knowl. Eng.

Author:

Caram Frederico Luiz¹,Rodrigues Bruno Rafael De Oliveira¹,Campanelli Amadeu Silveira¹,Parreiras Fernando Silva¹

Affiliation:

1. LAIS Laboratory for Advanced Information Systems, FUMEC University, Av. Afonso Pena 3880, Belo Horizonte, MG, 30130009, Brazil

Abstract

Code smells, or bad smells, are an accepted means of identifying design flaws in source code. Although researchers have explored them, programmers' interpretation of them is rather subjective. One way to deal with this subjectivity is to use machine learning techniques. This paper provides an overview of the machine learning techniques and code smells found in the literature, aiming to determine which methods and practices are used when applying machine learning to code smell identification and which machine learning techniques have been used for it. A mapping study was used to identify the techniques applied to each smell. We found that Bloaters were the main kind of smell studied, addressed by 35% of the papers. The most commonly used technique was Genetic Algorithms (GA), used by 22.22% of the papers. Regarding the smells addressed by each technique, there was a high level of redundancy: the smells are covered by a wide range of algorithms. Nevertheless, Feature Envy stood out, being targeted by 63% of the techniques. In terms of performance, the best average was achieved by Decision Tree, followed by Random Forest, Semi-supervised, and Support Vector Machine Classifier techniques. Five of the 25 analyzed smells were not handled by any machine learning technique. Most techniques target several code smells, and in general no technique outperforms the others, except for a few specific smells. We also found a lack of comparable results, owing to the heterogeneity of the data sources and of the reported results. We recommend further empirical studies that assess the performance of these techniques on a standardized dataset, to improve the reliability and replicability of comparisons.

Publisher

World Scientific Pub Co Pte Lt

Subject

Artificial Intelligence, Computer Graphics and Computer-Aided Design, Computer Networks and Communications, Software

Link

https://www.worldscientific.com/doi/pdf/10.1142/S021819401950013X

Reference: 41 articles.

1. Measurement of the maintenance process from a demand-based perspective

2. Support vector machines combined with feature selection for breast cancer diagnosis

3. Identifying refactoring opportunities in object-oriented code: A systematic literature review

4. Identifying Extract Class refactoring opportunities using structural and semantic cohesion measures

5. Mining static and dynamic crosscutting concerns: a role-based approach

Cited by 39 articles.

1. Prescriptive procedure for manual code smell annotation;Science of Computer Programming;2024-12

2. Machine Learning-Based Methods for Code Smell Detection: A Survey;Applied Sciences;2024-07-15

3. Application of Deep Learning for Code Smell Detection: Challenges and Opportunities;SN Computer Science;2024-06-03

4. Automatic detection of Feature Envy and Data Class code smells using machine learning;Expert Systems with Applications;2024-06

5. Quality Assessment of ChatGPT Generated Code and their Use by Developers;Proceedings of the 21st International Conference on Mining Software Repositories;2024-04-15