Abstract
Machine learning techniques such as regression have been developed to investigate associations between risk factors and disease in multivariable analysis. However, multicollinearity among explanatory variables makes interpretation more difficult and degrades the predictive ability of the model. This study compared Bridge and Elastic Net regressions in handling multicollinearity in multivariable analysis. The Wisconsin Diagnostic Breast Cancer data were used to compare the two regression techniques on model fit and on handling multicollinearity. Comparisons were made using MSE, RMSE, R^2, VIF, AIC and BIC to assess efficiency. Scatter plots were employed to show the fitted regression models. The results show that Bridge regression performed better in addressing multicollinearity, with a VIF of 1.182296 when 𝛾 = 2, compared with a VIF of 1.204298 for Elastic Net regression. For model fit, Bridge regression with 𝛾 = 0.5 performed better, with an MSE of 11.58667, an AIC of 258.9855 and a BIC of 277.2217. We conclude that both Bridge and Elastic Net regressions can be used to handle the multicollinearity problems that arise in multivariable regression analysis. Findings such as these can help those in the medical field improve diagnosis, narrow clinical trials and biopsies, and offer effective treatment.
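The comparison criteria named in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' code: the function names `fit_metrics` and `vif_two_predictors` and the toy numbers are hypothetical, and the AIC/BIC expressions assume the common Gaussian least-squares form (additive constants dropped), which the abstract does not specify. The VIF helper covers only the two-predictor case, where VIF reduces to 1 / (1 - r^2); a value close to 1 indicates little collinearity.

```python
import math


def fit_metrics(y, y_hat, k):
    """Return MSE, RMSE, R^2, AIC and BIC for observed y and fitted y_hat.

    k is the number of estimated parameters, supplied by the caller.
    ASSUMPTION: AIC/BIC use n*ln(RSS/n) + penalty, a common least-squares
    form; the study's exact convention is not stated in the abstract.
    """
    n = len(y)
    rss = sum((a - b) ** 2 for a, b in zip(y, y_hat))  # residual sum of squares
    mean_y = sum(y) / n
    tss = sum((a - mean_y) ** 2 for a in y)            # total sum of squares
    mse = rss / n
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        "r2": 1.0 - rss / tss,
        "aic": n * math.log(rss / n) + 2 * k,
        "bic": n * math.log(rss / n) + k * math.log(n),
    }


def vif_two_predictors(x1, x2):
    """VIF for a pair of predictors: 1 / (1 - r^2), r = Pearson correlation."""
    n = len(x1)
    m1, m2 = sum(x1) / n, sum(x2) / n
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    r_squared = s12 ** 2 / (s11 * s22)
    return 1.0 / (1.0 - r_squared)


# Toy values for illustration only; they are not taken from the study.
metrics = fit_metrics([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8], k=2)
vif = vif_two_predictors([1.0, 2.0, 3.0, 4.0], [1.0, 0.0, 1.0, 0.0])
print(metrics, vif)
```

Lower MSE, RMSE, AIC and BIC and a higher R^2 point to a better fit under these definitions. With more than two predictors, each VIF would instead come from regressing one predictor on all the others.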
Publisher
African - British Journals
Subject
General Medicine, General Chemistry
Cited by
1 article.