ENSEMBLE META CLASSIFIER WITH SAMPLING AND FEATURE SELECTION FOR DATA WITH MULTICLASS IMBALANCE PROBLEM

Author:

Sainin Mohd Shamrie (1), Alfred Rayner (1), Ahmad Faudziah (2)

Affiliation:

1. Faculty of Computing and Informatics, Universiti Malaysia Sabah, Malaysia

2. School of Computing, Universiti Utara Malaysia, Malaysia

Abstract

Ensemble learning, which combines several single classifiers or other ensemble classifiers, is one approach to the imbalance problem in multiclass data. However, it remains unclear how ensemble methods obtain their higher performance. This paper investigates the design of a meta classifier ensemble with sampling and feature selection for multiclass imbalanced data. The specific objectives were: 1) to improve the ensemble classifier through a data-level approach (sampling and feature selection); 2) to perform experiments on sampling, feature selection, and the ensemble classifier model; and 3) to evaluate the performance of the ensemble classifier. To fulfil these objectives, a preliminary data collection of Malaysian plants' leaf images was prepared and used in experiments, and the results were compared. The ensemble design was also tested on three other benchmark data sets with high imbalance ratios. The design that combines sampling, feature selection, and an ensemble classifier via AdaBoostM1 with random forest (itself an ensemble classifier) provided improved performance throughout the investigation. The result of this study is relevant to the ongoing problem of multiclass imbalance, where a specific ensemble structure can improve performance in terms of processing time and accuracy.
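The data-level step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which sampling technique the authors used, so the example below assumes plain random oversampling. The function names (`imbalance_ratio`, `random_oversample`) and the toy data are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def imbalance_ratio(labels):
    """Size of the largest class divided by the size of the smallest class."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    is as large as the largest class (assumed technique, not the paper's)."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Draw with replacement from the class's own rows to fill the gap.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Toy three-class data set, made up for illustration: 8 / 3 / 1 samples.
labels = ["a"] * 8 + ["b"] * 3 + ["c"] * 1
rows = [[float(i), float(i % 3)] for i in range(len(labels))]

balanced_rows, balanced_labels = random_oversample(rows, labels)
print(imbalance_ratio(labels))           # 8.0
print(imbalance_ratio(balanced_labels))  # 1.0
```

In the design described above, the resampled data would then pass through feature selection and into the AdaBoostM1 ensemble with random forest as its base classifier. Those stages are not reproduced here.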

Publisher

UUM Press, Universiti Utara Malaysia

Subject

General Mathematics, General Computer Science


Cited by 9 articles.