Authors:
Baradieh Khaled, Zainuri Muhammad, Kamari Nor, Yusof Yushaizad, Abdullah Huda, Zaman Mohd, Zulkifley Mohd
Abstract
Fault detection and classification in photovoltaic (PV) arrays are critical for increasing grid reliability and reducing power losses. This paper assesses twelve machine learning classifiers for their effectiveness in detecting and classifying faults in PV systems. Multiple validation methods were used to evaluate the algorithms, including K-fold, stratified K-fold, leave-one-out, and random-split cross-validation, to ensure robust performance measures. The criteria for selecting the top-performing classifier are accuracy, precision, recall, and computational efficiency. The dataset, comprising samples of various fault types under diverse environmental conditions, was thoroughly preprocessed to enhance model training and ensure generalizability. A large dataset of roughly 10,000 samples was used for model training and for multiple random tests on new, unseen data. This dataset provides a fair representation of the healthy condition and of multiple fault types, including Line-to-Line (LL), Line-to-Ground (LG), Partial Shading (PS), and Complete Shading (CS) faults. The data preprocessing comprised normalization, replacement of missing values with the average, and multiple statistical analysis approaches to reduce the size of the feature matrix and to improve the dependability of the model's predictions across varying operating conditions. The results show that the optimized Random Forest classifier performs best, reaching an average fault detection accuracy of 100% and a fault classification accuracy of 94.7%. The hyperparameters of the classifier were optimized using the Random Search Optimization (RSO) algorithm.
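As an illustration of the preprocessing and validation steps named in the abstract, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes column-wise mean imputation, min-max scaling as the normalization (the abstract does not state which normalization was used), and a simple round-robin stratified K-fold splitter. The function names, the two toy feature columns, and the toy label list are hypothetical.

```python
import math
from collections import defaultdict


def _is_missing(value):
    # Treat both None and NaN as missing measurements.
    return value is None or math.isnan(value)


def impute_with_mean(rows):
    """Replace missing entries with the mean of the observed values in that column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if not _is_missing(r[j])]
        # Assumption: a column with no observed values falls back to 0.0.
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if _is_missing(v) else v for j, v in enumerate(r)] for r in rows]


def min_max_normalize(rows):
    """Scale every column to [0, 1]; constant columns map to 0.0."""
    n_cols = len(rows[0])
    lows = [min(r[j] for r in rows) for j in range(n_cols)]
    highs = [max(r[j] for r in rows) for j in range(n_cols)]
    return [
        [(v - lows[j]) / (highs[j] - lows[j]) if highs[j] > lows[j] else 0.0
         for j, v in enumerate(r)]
        for r in rows
    ]


def stratified_k_fold(labels, k):
    """Yield (train_idx, test_idx) pairs that roughly keep class proportions per fold."""
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    # Deal the samples of each class round-robin across the k folds.
    for indices in by_class.values():
        for pos, i in enumerate(indices):
            folds[pos % k].append(i)
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test


# Hypothetical toy data: two feature columns with missing readings.
raw = [[600.0, 25.0], [None, 30.0], [800.0, float("nan")], [1000.0, 35.0]]
features = min_max_normalize(impute_with_mean(raw))

# Hypothetical labels using the five classes listed in the abstract.
labels = ["healthy", "LL", "LG", "PS", "CS"] * 4
splits = list(stratified_k_fold(labels, k=4))
```

With these toy labels, each of the four test folds receives one sample per class, which is the property that distinguishes stratified K-fold from plain K-fold. In practice a library implementation, with shuffling and a fixed random seed, would normally be preferred over a hand-written splitter.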