Affiliation:
1. TOKAT GAZİOSMANPAŞA UNIVERSITY
2. ONDOKUZ MAYIS UNIVERSITY
Abstract
Multivariate Adaptive Regression Splines (MARS) is a supervised machine learning model that is not, by itself, obtained by an ensemble learning method. Ensemble learning methods combine hundreds or thousands of learners that serve the common purpose of improving the stability and accuracy of machine learning algorithms. This study presents REMARS (Random Ensemble MARS), a new MARS model selection approach obtained using the Random Forest (RF) algorithm. Two hundred training and test data sets generated via the Bagging method were analysed in the MARS analysis engine. At the end of the analysis, two different MARS model sets were created, one yielding the smallest Mean Square Error for the test data (Test MSE) and the other yielding the smallest Generalised Cross-Validation (GCV) value. The best model was estimated for both the Test MSE and GCV criteria by examining the error measurement criteria, the variable importance averages, and the frequencies of the knot values for each model. Eventually, a new model, REMARS, was obtained via the ensemble learning method; it yields results as good as those of the MARS model obtained from the original data set. MARS, which works better on larger data sets, provides more reliable results on smaller data sets when the proposed method is used.
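The resampling and selection steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: a single-hinge regression (`fit_one_knot`) stands in for a full MARS engine, the out-of-bag points of each bootstrap sample are assumed to serve as the test set, and the GCV uses a commonly quoted form with an assumed penalty of 3 per knot. All function names and the synthetic data are hypothetical.

```python
import random
from collections import Counter
from statistics import mean

def hinge(x, knot):
    # MARS-style hinge basis function max(0, x - knot)
    return max(0.0, x - knot)

def fit_one_knot(xs, ys, knots):
    """Stand-in for a MARS fit: y ~ b0 + b1 * max(0, x - knot), best knot by RSS."""
    best = None
    for knot in knots:
        hs = [hinge(x, knot) for x in xs]
        h_bar, y_bar = mean(hs), mean(ys)
        sxx = sum((h - h_bar) ** 2 for h in hs)
        if sxx == 0.0:
            continue  # no sample lies beyond this knot, so the slope is undefined
        b1 = sum((h - h_bar) * (y - y_bar) for h, y in zip(hs, ys)) / sxx
        b0 = y_bar - b1 * h_bar
        rss = sum((y - (b0 + b1 * h)) ** 2 for h, y in zip(hs, ys))
        if best is None or rss < best["rss"]:
            best = {"knot": knot, "b0": b0, "b1": b1, "rss": rss}
    return best

def predict(model, x):
    return model["b0"] + model["b1"] * hinge(x, model["knot"])

def gcv(rss, n, n_terms, n_knots, penalty=3.0):
    # Assumed form: (RSS / n) / (1 - C / n)^2 with C = n_terms + penalty * n_knots
    effective = n_terms + penalty * n_knots
    return (rss / n) / (1.0 - effective / n) ** 2

def bagged_fits(xs, ys, knots, n_rounds=200, seed=0):
    rng = random.Random(seed)
    n = len(xs)
    results = []
    for _ in range(n_rounds):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample with replacement
        chosen = set(idx)
        oob = [i for i in range(n) if i not in chosen]  # out-of-bag points as test data
        if not oob:
            continue
        model = fit_one_knot([xs[i] for i in idx], [ys[i] for i in idx], knots)
        if model is None:
            continue
        test_mse = mean((ys[i] - predict(model, xs[i])) ** 2 for i in oob)
        results.append({"model": model, "test_mse": test_mse,
                        "gcv": gcv(model["rss"], n, n_terms=2, n_knots=1)})
    return results

# Synthetic data (hypothetical): flat until x = 3, then slope 2, plus small noise
data_rng = random.Random(42)
xs = [i / 10 for i in range(60)]
ys = [1.0 + 2.0 * hinge(x, 3.0) + data_rng.gauss(0.0, 0.1) for x in xs]
knots = [1.0, 2.0, 3.0, 4.0, 5.0]

results = bagged_fits(xs, ys, knots)
best_by_mse = min(results, key=lambda r: r["test_mse"])
best_by_gcv = min(results, key=lambda r: r["gcv"])
knot_counts = Counter(r["model"]["knot"] for r in results)

print("best knot by Test MSE:", best_by_mse["model"]["knot"])
print("best knot by GCV:", best_by_gcv["model"]["knot"])
print("knot frequencies:", dict(knot_counts))
```

The two `min` calls mirror the two model sets in the abstract (smallest Test MSE and smallest GCV), and `knot_counts` mirrors the inspection of knot frequencies across the bagged fits. A real application would replace `fit_one_knot` with a proper MARS engine and would also average variable importances, which this one-predictor sketch cannot show.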