Stratified Sampling-Based Deep Learning Approach to Increase Prediction Accuracy of Unbalanced Dataset

Author:

Sadaiyandi Jeyabharathy (1), Arumugam Padmapriya (1), Sangaiah Arun Kumar (2, 3), Zhang Chao (4)

Affiliation:

1. Department of Computer Science, Alagappa University, Karaikudi 630003, India

2. International Graduate School of AI, National Yunlin University of Science and Technology, Douliu 64002, Taiwan

3. Department of Electrical and Computer Engineering, Lebanese American University, Byblos 13-5053, Lebanon

4. School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China

Abstract

Classifying imbalanced data and drawing accurate predictions from it remains a challenging task. Sampling procedures, combined with machine learning and deep learning algorithms, help address this problem. The objective of this study is to use sampling-based machine learning and deep learning approaches to automate the recognition of rotting trees in a forest dataset. Method/Approach: This work presents a novel method for determining a tree’s state of decay, and the proposed approach successfully predicted dead trees in the forest. Seven of the twenty-one features are selected using the wrapper approach. Classifying a tree’s state of decay is affected by unequal class distribution: when the classes to be predicted are uneven, overall results frequently hide poor performance on the minority classes. Stratified sampling is therefore used to prepare the samples needed for accurate prediction, and these samples, with the selected features, are input into a deep neural network. Finding: The multi-layer feed-forward classifier produces the best results, with a classification accuracy of 91%. Novelty/Improvement: Machine learning approaches need correct samples to classify correctly. In the present study, stratified samples were used when deciding which samples to feed into the deep neural network. The results suggest that the proposed algorithm can accurately determine whether a tree has decayed.
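To make the stratified sampling idea concrete, the following is a minimal sketch using only the Python standard library. It is not the authors' implementation: the function names `stratified_split` and `per_class_recall`, the 90/10 toy labels, and the 20% test fraction are all assumptions chosen for illustration. It shows how a stratified split keeps each class's share in the test set, and how a high overall accuracy can hide zero recall on the minority class.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split indices so each class keeps roughly its original share (illustrative helper)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Sample within each class (stratum); keep at least one test item per class.
        n_test = max(1, round(len(indices) * test_fraction))
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def per_class_recall(y_true, y_pred):
    """Fraction of each true class that was predicted correctly."""
    correct, total = Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return {c: correct[c] / total[c] for c in total}


# Hypothetical toy data: 90 "healthy" trees and 10 "decayed" trees.
labels = ["healthy"] * 90 + ["decayed"] * 10
train_idx, test_idx = stratified_split(labels, test_fraction=0.2, seed=0)
test_labels = [labels[i] for i in test_idx]
test_counts = Counter(test_labels)

# A trivial predictor that always outputs the majority class.
majority_pred = ["healthy"] * len(test_labels)
accuracy = sum(t == p for t, p in zip(test_labels, majority_pred)) / len(test_labels)
recall = per_class_recall(test_labels, majority_pred)

print(test_counts)
print(accuracy, recall)
```

In this toy setting the majority-class predictor looks accurate overall while never detecting a decayed tree, which is why per-class metrics on a stratified test set are more informative than accuracy alone.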

Funder

Rashtriya Uchchatar Shiksha Abhiyan (RUSA) Phase 2.0

Publisher

MDPI AG

Subject

Electrical and Electronic Engineering, Computer Networks and Communications, Hardware and Architecture, Signal Processing, Control and Systems Engineering

