Unsupervised Feature Selection Methodology for Clustering in High Dimensionality Datasets

Published:2020-04-27 Issue:2 Volume:27 Page:30-41
ISSN:2175-2745
Container-title:Revista de Informática Teórica e Aplicada
Short-container-title:RITA

Author:

Oliveira Marcos De Souza,Queiroz Sergio

Abstract

Feature selection is an important research area that seeks to eliminate unwanted features from datasets. Many feature selection methods have been proposed in the literature, but the best set of features is usually evaluated with supervised metrics, which require labels. In this work we propose a methodology that helps data specialists answer simple but important questions, such as: (1) do current feature selection methods give similar results? (2) is there a consistently better method? (3) how should the m best features be selected? (4) since the methods are not parameter-free, how should the best parameters be chosen in the unsupervised scenario? and (5) given different selection options, could we get better results by fusing the results of the methods, and if so, how should the results be combined? We analyze these issues and propose a methodology that, building on several unsupervised methods, performs feature selection on high-dimensional datasets using strategies that make the process fully automatic and unsupervised. We then evaluate the results and find that they are better than those obtained by using the selection methods in their standard configurations. Finally, we list some further improvements that can be made in future work.

Publisher

Universidade Federal do Rio Grande do Sul

Subject

General Computer Science

Link

https://seer.ufrgs.br/rita/article/viewFile/96081/pdf

Cited by 2 articles.

1. Mining Self-Defined Business Process in Electronic Administration;International Journal of E-Services and Mobile Applications;2022-07-08

2. An improved feature selection method based on angle-guided multi-objective PSO and feature-label mutual information;Applied Intelligence;2022-06-01