Abstract
Class imbalance is one of several problems in customer churn datasets. Another common problem is class overlap, where instances from different classes have similar feature values. Customer churn prediction becomes more challenging when the training data contain class overlap. In this research, we propose a hybrid method based on tabular GANs, called CTGAN-ENN, to address class overlap and class imbalance in customer churn datasets. We used five different customer churn datasets from an open platform. CTGAN is a tabular GAN-based oversampling method that addresses class imbalance but suffers from class overlap. We combined CTGAN with the ENN under-sampling technique to overcome the class overlap. CTGAN-ENN reduced the class overlap of each feature in all datasets. We investigated how effective CTGAN-ENN is with each machine learning technique. In our experiments, CTGAN-ENN achieved satisfactory results with KNN, GBM, and XGB for customer churn prediction. We compared CTGAN-ENN with common over-sampling and hybrid sampling methods, and CTGAN-ENN outperformed them. We also report a time consumption comparison between CTGAN and CTGAN-ENN, in which CTGAN-ENN consumed less time than CTGAN. Our work provides a new framework for handling customer churn prediction with several types of imbalanced datasets and can be useful for real-world customer churn data.
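The cleaning step of the hybrid method can be illustrated with a minimal sketch. The abstract does not give implementation details, so everything here is an assumption: the function name `edited_nearest_neighbours`, the toy data, `k=3`, Euclidean distance, and the choice to apply the classic ENN rule (drop a sample whose label disagrees with the majority of its nearest neighbours) to every class. The hand-written minority points stand in for CTGAN output; no GAN is trained in this sketch.

```python
import math
from collections import Counter


def edited_nearest_neighbours(X, y, k=3):
    """Return the indices kept by one simple ENN pass (illustrative only).

    A sample is dropped when its label differs from the majority label of
    its k nearest neighbours (Euclidean distance, the sample itself excluded).
    """
    keep = []
    for i, xi in enumerate(X):
        # Distances from sample i to every other sample, nearest first.
        dists = sorted((math.dist(xi, xj), j) for j, xj in enumerate(X) if j != i)
        neighbour_labels = [y[j] for _, j in dists[:k]]
        majority = Counter(neighbour_labels).most_common(1)[0][0]
        if majority == y[i]:
            keep.append(i)
    return keep


# Hypothetical toy data: label 0 = non-churn (majority), label 1 = churn.
X = [
    (0.0, 0.0), (0.1, 0.2), (0.2, 0.0), (0.0, 0.3), (0.3, 0.1),  # real non-churn
    (5.0, 5.0), (5.1, 4.9), (4.9, 5.2), (5.2, 5.1),              # churn cluster
    (0.15, 0.1),  # stand-in for a synthetic churn sample that overlaps class 0
]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

kept = edited_nearest_neighbours(X, y, k=3)
X_clean = [X[i] for i in kept]
y_clean = [y[i] for i in kept]
```

In this toy case only the overlapping synthetic point (index 9) is removed, because all of its nearest neighbours belong to the other class. Library implementations of ENN may differ in which classes they clean and how neighbour agreement is judged.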
Publisher
Research Square Platform LLC