K-means clustering based filter feature selection on high dimensional data

Published:2016-03-31 Issue:1 Volume:2 Page:38
ISSN:2548-3161
Container-title:International Journal of Advances in Intelligent Informatics
Short-container-title:Int. J. Adv. Intell. Informatics

Author:

Ismi Dewi Pramudi, Panchoo Shireen, Murinto Murinto

Abstract

With hundreds or thousands of features, high dimensional data poses a challenging computational workload. In classification, features that do not contribute significantly to class prediction add to this workload. The aim of this paper is therefore to use feature selection to decrease the computational load by reducing the size of high dimensional data. Subsets of features that represent all features are selected. The process is thus two-fold: discarding irrelevant data and choosing one feature to represent a number of redundant features. There have been many studies on feature selection, for example backward feature selection and forward feature selection. In this study, a k-means clustering based feature selection is proposed. It is assumed that redundant features are located in the same cluster, whereas irrelevant features do not belong to any cluster. Two high dimensional datasets are used: 1) the Human Activity Recognition Using Smartphones (HAR) Dataset, containing 7352 data points of 561 features each, and 2) the National Classification of Economic Activities Dataset, containing 1080 data points of 857 features each. Both datasets provide a class label for each data point. Our experiment shows that k-means clustering based feature selection can produce a subset of features that yields more than 80% classification accuracy.
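
The idea of clustering features and keeping one representative per cluster can be sketched as follows. This is a minimal illustration and not the authors' implementation. The abstract does not specify the distance measure, the initialization, or how representatives are picked, so the standardization step, the farthest-first initialization, the "closest to centroid" rule, and the function names `kmeans` and `select_features` are all assumptions. The sketch also does not attempt to detect irrelevant features that belong to no cluster.

```python
import numpy as np


def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)


def farthest_first_init(points, k):
    """Deterministic initialization (an assumption, not from the paper):
    start at point 0, then repeatedly add the point farthest from those chosen."""
    chosen = [0]
    min_d = ((points - points[0]) ** 2).sum(axis=1)
    for _ in range(1, k):
        nxt = int(min_d.argmax())
        chosen.append(nxt)
        min_d = np.minimum(min_d, ((points - points[nxt]) ** 2).sum(axis=1))
    return points[chosen]


def kmeans(points, k, n_iter=100):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    centroids = farthest_first_init(points, k)
    for _ in range(n_iter):
        labels = assign(points, centroids)
        # Recompute each centroid; keep the old one if a cluster went empty.
        updated = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return assign(points, centroids), centroids


def select_features(X, k):
    """Cluster the columns of X and keep one column index per cluster."""
    # Standardize so that clustering reflects feature shape, not scale (assumption).
    Xs = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)
    F = Xs.T  # one row per feature
    labels, centroids = kmeans(F, k)
    selected = []
    for j in range(k):
        members = np.flatnonzero(labels == j)
        if members.size == 0:
            continue
        # Representative: the member feature closest to the cluster centroid.
        d = ((F[members] - centroids[j]) ** 2).sum(axis=1)
        selected.append(int(members[d.argmin()]))
    return sorted(selected)


# Synthetic data: 3 independent signals, each duplicated 3 times with small noise,
# so columns 0-2, 3-5 and 6-8 form three groups of redundant features.
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 3))
X = np.repeat(base, 3, axis=1) + 0.05 * rng.normal(size=(200, 9))

selected = select_features(X, k=3)
print(selected)
```

On this synthetic data the redundant groups are well separated, so the sketch keeps one column from each group. On real data such as HAR, the number of clusters `k` would have to be tuned, and the reduced feature set would then be passed to a classifier to measure accuracy.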

Publisher

Universitas Ahmad Dahlan, Kampus 3

Subject

Artificial Intelligence, Computer Vision and Pattern Recognition, Human-Computer Interaction

Link

http://ijain.org/index.php/IJAIN/article/viewFile/54/pdf_7

Cited by 17 articles.

1. Class-incremental Learning for Time Series: Benchmark and Evaluation;Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining;2024-08-24

2. An ensemble maximal feature subset selection for smartphone based human activity recognition;Journal of Network and Computer Applications;2024-06

3. Mapping of crime prone areas in Batubara districts using the k-means method;AIP Conference Proceedings;2024

4. Human Activity Recognition with Smartphone Sensors Using RNN;2023 International Conference on Ambient Intelligence, Knowledge Informatics and Industrial Electronics (AIKIIE);2023-11-02

5. Enhancing the accuracy of electroencephalogram-based emotion recognition through Long Short-Term Memory recurrent deep neural networks;Frontiers in Human Neuroscience;2023-10-10