Attributes Reduction in Big Data

Published:2020-07-17 Issue:14 Volume:10 Page:4901
ISSN:2076-3417
Container-title:Applied Sciences
language:en
Short-container-title:Applied Sciences

Author:

Albattah Waleed, Khan Rehan Ullah, Khan Khalil

Abstract

Processing big data requires substantial computing resources, so big data processing is a challenge not only for algorithms but also for computing infrastructure. This article analyzes a large amount of data from several points of view. One perspective is the processing of reduced collections of big data with fewer computing resources. To that end, the study analyzed 40 GB of data to test various strategies for reducing the amount of data to be processed. The goal is to reduce the data without compromising detection and model learning in machine learning. Several alternatives were analyzed, and it was found that in many cases and settings the data can be reduced to some extent without compromising detection efficiency. Tests with 200 attributes showed that more than 80% of the data could be ignored with a performance loss of only 4%. The results of the study thus provide useful insights into big data analytics.
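The trade-off described in the abstract (discard most attributes, accept a small loss in detection performance) can be illustrated with a minimal sketch. This is not the authors' method or data: the synthetic two-class dataset, the mean-difference filter score, the nearest-centroid classifier, and all function names below are assumptions chosen only to keep the example self-contained.

```python
import random


def make_data(n_rows, n_attrs, n_informative, seed=0):
    """Synthetic two-class data (assumption): only the first
    n_informative attributes carry any class signal."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n_rows):
        y = rng.randint(0, 1)
        row = [
            rng.gauss(1.0 if (j < n_informative and y == 1) else 0.0, 1.0)
            for j in range(n_attrs)
        ]
        rows.append(row)
        labels.append(y)
    return rows, labels


def class_means(rows, labels, attrs):
    """Per-class mean of each selected attribute."""
    means = {}
    for c in (0, 1):
        members = [r for r, y in zip(rows, labels) if y == c]
        means[c] = [sum(r[j] for r in members) / len(members) for j in attrs]
    return means


def rank_attributes(rows, labels):
    """Rank attributes by the absolute difference of their class means
    (a simple filter score, used here purely for illustration)."""
    all_attrs = list(range(len(rows[0])))
    m = class_means(rows, labels, all_attrs)
    scores = [abs(m[1][j] - m[0][j]) for j in all_attrs]
    return sorted(all_attrs, key=lambda j: scores[j], reverse=True)


def accuracy(train, test, attrs):
    """Nearest-centroid accuracy on the test split, using only attrs."""
    (x_train, y_train), (x_test, y_test) = train, test
    m = class_means(x_train, y_train, attrs)
    correct = 0
    for r, y in zip(x_test, y_test):
        dist = {
            c: sum((r[j] - mu) ** 2 for j, mu in zip(attrs, m[c]))
            for c in (0, 1)
        }
        correct += min(dist, key=dist.get) == y
    return correct / len(y_test)


# 200 attributes, of which 20 are informative (hypothetical numbers).
X, y = make_data(n_rows=600, n_attrs=200, n_informative=20)
train = (X[:400], y[:400])
test = (X[400:], y[400:])

# Rank on the training split only, then keep the top 20% of attributes.
ranked = rank_attributes(*train)
kept = ranked[: len(ranked) // 5]

full_acc = accuracy(train, test, list(range(200)))
reduced_acc = accuracy(train, test, kept)
print(f"all 200 attributes: accuracy = {full_acc:.3f}")
print(f"top {len(kept)} attributes: accuracy = {reduced_acc:.3f}")
```

Ranking on the training split alone keeps the test split untouched by the selection step, so the two accuracies are comparable. The gap between them on this synthetic data says nothing about the 4% figure reported in the abstract; it only shows how such a comparison might be set up.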

Funder

Qassim University

Publisher

MDPI AG

Subject

Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science

Link

https://www.mdpi.com/2076-3417/10/14/4901/pdf

References: 53 articles.

1. Big Data for Development: A Review of Promises and Challenges

2. Exascale computing and big data

3. Machine Learning With Big Data: Challenges and Approaches

4. Big Data Analytics framework for Peer-to-Peer Botnet detection using Random Forests

Cited by 5 articles.

1. Data reduction in big data: a survey of methods, challenges and future directions;International Journal of Data Science and Analytics;2024-07-10

2. Big Data Analytics: Deep Content-Based Prediction with Sampling Perspective;Computer Systems Science and Engineering;2023

3. A Rough Set Approach to Dimensionality Reduction for Performance Enhancement in Machine Learning;International Journal of Emerging Scientific Research;2022-10-31

4. Manoeuvre of Machine Learning Algorithms in Healthcare Sector with Application to Polycystic Ovarian Syndrome Diagnosis;Advances in Intelligent Systems and Computing;2022

5. Breast Cancer Survival Prediction Using Machine Learning;Computational Intelligence in Oncology;2022