Machine Learning-Based Methods for Code Smell Detection: A Survey

Published:2024-07-15 Issue:14 Volume:14 Page:6149
ISSN:2076-3417
Container-title:Applied Sciences
language:en

Author:

Yadav Pravin Singh¹, Rao Rajwant Singh¹, Mishra Alok²³, Gupta Manjari⁴

Affiliation:

1. Department of Computer Science and Information Technology, Guru Ghasidas Vishwavidyalaya, Bilaspur 495009, Chhattisgarh, India

2. Faculty of Engineering, Norwegian University of Science and Technology (NTNU), 7491 Trondheim, Norway

3. Informatics and Digitalization Group, Molde University College, Specialized University in Logistics, 6402 Molde, Norway

4. Computer Science, DST—Centre for Interdisciplinary Mathematical Sciences, Institute of Science, Banaras Hindu University, Varanasi 221005, Uttar Pradesh, India

Abstract

Code smells are early warning signs of potential issues in software quality. Various techniques are used in code smell detection, including the Bayesian approach, rule-based automatic antipattern detection, antipattern identification utilizing B-splines, Support Vector Machine direct, SMURF (Support Vector Machines for design smell detection using relevant feedback), and immune-based detection strategy. Machine learning (ML) has made great strides in this area. This survey comprehensively covers relevant studies that applied ML algorithms from 2005 to 2024, to provide insight into code smells, the ML algorithms frequently applied, and software metrics. Forty-two pertinent studies allow us to assess the efficacy of ML algorithms on selected datasets. After evaluating various studies based on open-source and project datasets, this study evaluated additional threats and obstacles to code smell detection, such as the lack of standardized code smell definitions, the difficulty of feature selection, and the challenges of handling large-scale datasets. Existing studies considered only a few factors in identifying code smells, whereas this study includes several potential contributing factors. Several ML algorithms are examined, and various approaches, datasets, dataset languages, and software metrics are presented. This study shows the potential of ML algorithms to produce better results and fills a gap in the body of knowledge by providing class-wise distributions of the ML algorithms. Support Vector Machine, J48, Naive Bayes, and Random Forest models are the most common for detecting code smells. Researchers can find this study helpful in better anticipating and addressing software design and implementation issues. The findings from this study, which highlight the practical implications of ML algorithms in software quality improvement, will help software engineers fix problems during software design and development to ensure software quality.

Publisher

MDPI AG

Link

https://www.mdpi.com/2076-3417/14/14/6149/pdf
