Homogenous Ensemble Boosting Approach to Improve the Consistency in the Accuracy of Text Data Classification

Published: 2023-09-14

Author:

Azam Muhammad¹, Sabah Fahad¹, Raheem Abdul², Ahmad Nadeem¹, Irfan Danish¹, Sarwar Raheem³

Affiliation:

1. Superior University

2. Beijing University of Technology

3. OTEHM, Manchester Metropolitan University

Abstract

The rapid growth of the internet in recent years has produced an enormous amount of data, a significant portion of which is unstructured. This unstructured data requires careful analysis and modelling before it becomes useful for decision making. With the internet now widespread across the globe, new applications are developed every day. These applications interact directly with end users, who can share opinions, sentiments, and reviews about products, services, events, and so on. Such sentiments, reviews, and opinions are very useful to individuals, organizations, businesses, and governments for future decision making. Surveys from the last few years suggest that online opinions have a greater financial effect than traditional media advertising. The central task of sentiment analysis is to extract useful information from user sentiment. Although user-generated content is potentially valuable, most of it requires data mining and sentiment analysis methods before it can be used, and sentiment analysis itself faces several challenges. Sentiment analysis applies natural language processing and text analysis methods to identify and extract useful information from text data. Machine learning techniques are widely used for sentiment classification. In this paper, we examine in depth different machine learning approaches to sentiment classification. We carry out an extensive study of homogeneous ensemble-based machine learning techniques for sentiment classification, aiming to improve efficiency and consistency by combining multiple learners to achieve better accuracy than any individual learning algorithm can attain. Our methodology covers the whole process, from data preprocessing to classification accuracy.
Various preprocessing steps are applied to the selected text data to prepare it for classification. Several classification models from different classifier families (NB, NNET, KNN, RPART, SVM, LDA, CTREE) are explored. Finally, homogeneous ensemble techniques, Boosting (GBM) and Bagging (RF), are applied and compared with the individual classifiers. The results show that the Boosting ensemble model is more consistent and more accurate than all the other models discussed.
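The pipeline the abstract outlines (preprocess text, train a weak classifier, then boost a homogeneous ensemble of that same classifier type) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract names GBM, whereas the sketch below swaps in AdaBoost over word-presence decision stumps because it fits in a few lines of standard-library Python. The toy corpus, the function names, and the choice of three boosting rounds are all assumptions made for illustration.

```python
import math
import re

# Hypothetical toy corpus (not from the paper): label +1 = positive, -1 = negative.
DOCS = [
    ("Great product, works well", 1),
    ("Really great service", 1),
    ("I love it, great value", 1),
    ("Good quality and fast delivery", 1),
    ("Terrible product, broke quickly", -1),
    ("Bad service and terrible support", -1),
    ("I hate it, poor value", -1),
    ("Poor quality, very bad", -1),
]

def preprocess(text):
    """Lowercase the text, drop punctuation, and return the set of word tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))

def stump_predict(stump, tokens):
    """A stump votes `polarity` when its word is present, otherwise `-polarity`."""
    word, polarity = stump
    return polarity if word in tokens else -polarity

def fit_stump(samples, weights, vocab):
    """Return the (word, polarity) stump with the lowest weighted training error."""
    best, best_err = None, float("inf")
    for word in vocab:
        for polarity in (1, -1):
            err = sum(w for (toks, y), w in zip(samples, weights)
                      if stump_predict((word, polarity), toks) != y)
            if err < best_err:
                best, best_err = (word, polarity), err
    return best, best_err

def adaboost(samples, rounds=3):
    """Train a homogeneous ensemble: every member is a stump, reweighted each round."""
    n = len(samples)
    weights = [1.0 / n] * n
    vocab = sorted(set().union(*(toks for toks, _ in samples)))
    ensemble = []
    for _ in range(rounds):
        stump, err = fit_stump(samples, weights, vocab)
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified samples, decrease the rest.
        weights = [w * math.exp(-alpha * y * stump_predict(stump, toks))
                   for (toks, y), w in zip(samples, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, text):
    """Weighted vote of all stumps on a raw text string."""
    toks = preprocess(text)
    score = sum(alpha * stump_predict(stump, toks) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

def accuracy(ensemble, docs):
    """Fraction of documents whose predicted label matches the given label."""
    return sum(predict(ensemble, text) == y for text, y in docs) / len(docs)

SAMPLES = [(preprocess(text), y) for text, y in DOCS]
single = adaboost(SAMPLES, rounds=1)   # one stump: an individual weak classifier
boosted = adaboost(SAMPLES, rounds=3)  # three stumps combined by boosting

if __name__ == "__main__":
    print("single stump accuracy:", accuracy(single, DOCS))
    print("boosted accuracy:", accuracy(boosted, DOCS))
```

On this toy data the single stump misclassifies one training document, and the boosted ensemble corrects it by giving that document more weight in later rounds. Accuracy here is measured on the training data only, so the comparison says nothing about the held-out accuracy or run-to-run consistency that the paper evaluates.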

Publisher

Research Square Platform LLC

Cited by 1 article.

1. Exploring Sleep Disorder and Lifestyle Analysis Through Data Preprocessing and Ensemble Learning Techniques;2024 2nd International Conference on Sustainable Computing and Smart Systems (ICSCSS);2024-07-10