DPDRC, a Novel Machine Learning Method about the Decision Process for Dimensionality Reduction before Clustering

Published:2021-12-29 Issue:1 Volume:3 Page:1-21
ISSN:2673-2688
Container-title:AI
language:en
Short-container-title:AI

Author:

Dessureault Jean-Sébastien, Massicotte Daniel

Abstract

This paper examines the critical decision process of reducing the dimensionality of a dataset before applying a clustering algorithm. Choosing between extracting and selecting features is always a challenge. Evaluating the importance of the features is not straightforward, since the most popular methods for doing so are usually intended for supervised learning. This paper proposes a novel method called “Decision Process for Dimensionality Reduction before Clustering” (DPDRC). It chooses the best dimensionality reduction method (selection or extraction) according to the data scientist’s parameters and the profile of the data, with the aim of applying a clustering process afterward. It uses a Feature Ranking Process Based on Silhouette Decomposition (FRSD) algorithm, a Principal Component Analysis (PCA) algorithm, and a K-means algorithm along with its metric, the Silhouette Index (SI). The paper presents five scenarios based on different parameters and discusses the impacts, advantages, and disadvantages of each choice that can be made in this unsupervised learning process.
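
The choice between selection and extraction described in the abstract can be illustrated with a minimal sketch. It is not the authors' DPDRC or FRSD implementation: the per-feature ranking below is a simple stand-in (each feature is scored by the Silhouette Index of a K-means run on that feature alone), and the function names, the synthetic dataset, and the rule "keep whichever reduction gives the higher SI" are all assumptions made for illustration. The sketch uses scikit-learn.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler


def kmeans_silhouette(X, n_clusters, seed=0):
    """Cluster X with K-means and return the Silhouette Index of the result."""
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(X)
    return float(silhouette_score(X, labels))


def choose_reduction(X, n_clusters, n_dims, seed=0):
    """Compare feature extraction (PCA) with a stand-in feature selection.

    Hypothetical helper: the ranking step is NOT the paper's FRSD algorithm.
    """
    X = StandardScaler().fit_transform(X)

    # Extraction: project onto the first n_dims principal components.
    X_pca = PCA(n_components=n_dims, random_state=seed).fit_transform(X)
    si_extraction = kmeans_silhouette(X_pca, n_clusters, seed)

    # Selection (assumed stand-in for FRSD): score each feature on its own,
    # then keep the n_dims features with the highest individual SI.
    per_feature = [
        kmeans_silhouette(X[:, [j]], n_clusters, seed) for j in range(X.shape[1])
    ]
    selected = sorted(np.argsort(per_feature)[::-1][:n_dims].tolist())
    si_selection = kmeans_silhouette(X[:, selected], n_clusters, seed)

    # Assumed decision rule: prefer the reduction with the higher SI.
    method = "selection" if si_selection >= si_extraction else "extraction"
    return {
        "method": method,
        "si_selection": si_selection,
        "si_extraction": si_extraction,
        "selected_features": selected,
    }


# Synthetic data: 2 clustered features plus 4 pure-noise features.
rng = np.random.default_rng(0)
informative, _ = make_blobs(n_samples=300, n_features=2, centers=3, random_state=0)
noise = rng.normal(size=(300, 4))
X_demo = np.hstack([informative, noise])

result = choose_reduction(X_demo, n_clusters=3, n_dims=2)
print(result)
```

Selection keeps the original, interpretable features, while extraction builds new axes that may separate clusters better but are harder to explain. The sketch only compares the two on SI; per the abstract, the actual method also takes the data scientist's parameters and the profile of the data into account.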

Publisher

MDPI AG

Link

https://www.mdpi.com/2673-2688/3/1/1/pdf

References: 30 articles.

1. Dynamic Programming;Bellman,1957

2. The Curse of Dimensionality in Data Mining and Time Series Prediction

3. A survey of feature selection and feature extraction techniques in machine learning

4. Spectral unmixing;IEEE Journals & Magazine, IEEE Xplore;https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=79

5. Clustering of European Smart Cities to Understand the Cities’ Sustainability Strategies

Cited by 5 articles.

1. DPDR: A Novel Machine Learning Method for the Decision Process for Dimensionality Reduction;SN Computer Science;2023-12-21

2. Power quality disturbances classification using autoencoder and radial basis function neural network;International Journal of Emerging Electric Power Systems;2023-09-25

3. $AI^{2}$: the next leap toward native language-based and explainable machine learning framework;Automated Software Engineering;2023-09-24

4. AI2: a novel explainable machine learning framework using an NLP interface;Proceedings of the 2023 8th International Conference on Machine Learning Technologies;2023-03-10

5. Active Power Load Data Dimensionality Reduction Using Autoencoder;Power Quality in Microgrids: Issues, Challenges and Mitigation Techniques;2023