Ensembles of instance selection methods: A comparative study

Published:2019-03-01 Issue:1 Volume:29 Page:151-168
ISSN:2083-8492
Container-title:International Journal of Applied Mathematics and Computer Science
language:en

Author:

Blachnik Marcin¹

Affiliation:

1. Department of Applied Informatics, Silesian University of Technology, Akademicka 2A, 44-100 Gliwice, Poland

Abstract

Instance selection is often performed as a preprocessing step which, along with feature selection, allows a significant reduction in computational complexity and an increase in prediction accuracy. So far, only a few authors have considered ensembles of instance selection methods, while ensembles of final predictive models attract many researchers. To bridge that gap, in this paper we compare four ensembles adapted to instance selection: Bagging, Feature Bagging, AdaBoost and Additive Noise. The last one is introduced for the first time in this paper. The study is based on an empirical comparison performed on 43 datasets and 9 base instance selection methods. The experiments are divided into three scenarios. In the first one, evaluated on a single dataset, we demonstrate the influence of the ensembles on the compression–accuracy relation. In the second scenario, the goal is to achieve the highest prediction accuracy. In the third one, both accuracy and the level of dataset compression constitute a multi-objective criterion. The obtained results indicate that ensembles of instance selection improve on the base instance selection algorithms, except for unstable methods such as CNN and IB3, and that this improvement is achieved at the expense of compression. In the comparison, Bagging and AdaBoost lead in most of the scenarios. In the experiments we evaluate three classifiers: 1NN, kNN and SVM. We also note a deterioration in prediction accuracy for robust classifiers (kNN and SVM) trained on data filtered by any instance selection method (including the ensembles), compared with the results obtained when the entire training set is used to train these classifiers.
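To make the idea of a Bagging ensemble of instance selection more concrete, the following is a minimal sketch and not the paper's implementation. It assumes an Edited Nearest Neighbor (ENN) rule as the base selector, a per-instance vote over bootstrap rounds, and an acceptance `threshold`; the abstract specifies none of these, and the function names `enn_select` and `bagging_instance_selection` are hypothetical.

```python
import random
from collections import Counter


def enn_select(X, y, k=3):
    """Assumed base selector (ENN): keep a point only if its label
    matches the majority label of its k nearest neighbors."""
    kept = []
    for i, xi in enumerate(X):
        # Squared Euclidean distance from point i to every other point.
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(xi, xj)), j)
            for j, xj in enumerate(X)
            if j != i
        )
        labels = [y[j] for _, j in dists[:k]]
        if not labels or Counter(labels).most_common(1)[0][0] == y[i]:
            kept.append(i)
    return kept


def bagging_instance_selection(X, y, n_rounds=25, threshold=0.5, k=3, seed=0):
    """Run the base selector on bootstrap samples and keep the instances
    that were selected in at least `threshold` of the rounds they entered."""
    rng = random.Random(seed)
    n = len(X)
    in_bag = [0] * n     # rounds in which instance i was drawn
    selected = [0] * n   # rounds in which instance i survived selection
    for _ in range(n_rounds):
        # Bootstrap: n draws with replacement, reduced to unique indices.
        bag = sorted({rng.randrange(n) for _ in range(n)})
        local = enn_select([X[i] for i in bag], [y[i] for i in bag], k)
        for i in bag:
            in_bag[i] += 1
        for pos in local:
            selected[bag[pos]] += 1
    return [i for i in range(n)
            if in_bag[i] and selected[i] / in_bag[i] >= threshold]


# Toy data (hypothetical): two clusters plus one mislabeled point (index 12).
X = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.2), (0.2, 0.8),
     (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.2), (10.2, 10.8),
     (0.5, 0.5)]
y = [0] * 6 + [1] * 6 + [1]

kept = bagging_instance_selection(X, y)
compression = 1 - len(kept) / len(X)
print(kept, round(compression, 3))
```

In this sketch, raising `threshold` discards more instances (higher compression), while lowering it retains more, which is one simple way to explore the compression–accuracy relation mentioned in the abstract.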

Publisher

Walter de Gruyter GmbH

Subject

Applied Mathematics,Engineering (miscellaneous),Computer Science (miscellaneous)

Link

https://www.sciendo.com/pdf/10.2478/amcs-2019-0012

References: 45 articles.

1. Abdi, H. (2010). Holm’s sequential Bonferroni procedure, Encyclopedia of Research Design 1(8): 620–627.

2. Aha, D., Kibler, D. and Albert, M. (1991). Instance-based learning algorithms, Machine Learning 6(1): 37–66.

3. Alcalá-Fdez, J., Fernández, A., Luengo, J., Derrac, J., García, S., Sanchez, L. and Herrera, F. (2011). Keel data-mining software tool: Data set repository, integration of algorithms and experimental analysis framework, Journal of Multiple-Valued Logic & Soft Computing 17: 255–287.

4. Arnaiz-González, Á., Blachnik, M., Kordos, M. and García-Osorio, C. (2016a). Fusion of instance selection methods in regression tasks, Information Fusion 30: 69–79.

5. Arnaiz-González, Á., Díez-Pastor, J., Rodríguez, J.J. and García-Osorio, C.I. (2016b). Instance selection for regression: Adapting DROP, Neurocomputing 201: 66–81.

Cited by 12 articles.

1. On the value of instance selection for bug resolution prediction performance;Journal of Software: Evolution and Process;2024-07-02

2. Bug Resolution Prediction for Open-Source Software Using Ensembles of Instance Selection Algorithms;2023 9th International Conference on Control, Decision and Information Technologies (CoDIT);2023-07-03

3. A novel binary horse herd optimization algorithm for feature selection problem;Multimedia Tools and Applications;2023-03-23

4. Instance selection using one‐versus‐all and one‐versus‐one decomposition approaches in multiclass classification datasets;Expert Systems;2023-01-02

5. Studies on Neural Networks as a Fusion Method for Dispersed Data with Noise;Lecture Notes in Information Systems and Organisation;2023