Affiliation:
1. Laboratory LIMTIC, Higher Institute of Computer Science, University of Tunis El Manar, 2 Rue Abou Raihan El Bayrouni, 2080 Ariana, Tunisia
Abstract
To uncover an appropriate latent subspace for data representation, we propose in this paper a new extension of the random forests method for unsupervised feature selection, called Feature Selection with Random Forests (RFS). RFS is based on self-organizing map (SOM) variants and evaluates out-of-bag feature importance from a set of partitions. Every partition is created using several bootstrap samples and a random feature subset. Empirical results on 19 benchmark datasets indicate that RFS, boosted with a recursive feature elimination (RFE) method, can substantially improve clustering accuracy with a very restricted subset of features. Simulations are performed on nine different benchmarks, including face data, handwritten digit data, and document data. Promising experimental results and theoretical analysis demonstrate the efficiency and effectiveness of the proposed method for feature selection in comparison with competitive representative algorithms.
Publisher
World Scientific Pub Co Pte Lt
Subject
Computer Science Applications, Theoretical Computer Science, Software
Cited by
7 articles.