Abstract
Before a machine learning model is built, the dataset should be prepared so that it is of high quality and gives the model the best possible representation of the data. Different attributes may be measured on different scales, which can make the modeled problem harder, and a model trained on attributes with widely varying scales may perform poorly during learning. This study explores numerical data scaling as a data pre-processing step and evaluates how effectively scaling methods can improve the accuracy of learning algorithms. In particular, three numerical data scaling methods were compared in combination with four machine learning classifiers for predicting disease severity. The experiments were built on severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) datasets that included 1206 patients admitted between June 2020 and April 2021. All cases were confirmed by RT-PCR, and basic demographic data and medical characteristics were collected for all participants. The reported results indicate that all of the techniques perform well with numerical data scaling and that the models improve significantly on unseen data. We conclude that scaling techniques increase classifier performance: they help the algorithms learn the patterns in the dataset, which in turn yields more accurate models.
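To make the idea of attribute scaling concrete, the following is a minimal sketch of two widely used scaling methods, min-max normalization and z-score standardization. The abstract does not name the three methods that were compared, so these two are assumptions chosen for illustration only. The function names and sample values are hypothetical and are not taken from the study's data.

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant attribute carries no scale information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def z_score_scale(values):
    """Center values on their mean and divide by the population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical attributes on very different scales (invented for illustration).
age_years = [20, 40, 60, 80]
marker_level = [100.0, 300.0, 500.0, 900.0]

scaled_age = min_max_scale(age_years)
scaled_marker = min_max_scale(marker_level)
standardized_age = z_score_scale(age_years)
```

After scaling, both attributes occupy comparable ranges, so neither dominates a distance-based or gradient-based learner purely because of its units. When performance on unseen data is evaluated, the scaling parameters (minimum, maximum, mean, standard deviation) are normally estimated on the training split only and then reused on the test split.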