A Multiple-Layer Machine Learning Architecture for Improved Accuracy in Sentiment Analysis

Published: 2019-04-27 Issue: 3 Volume: 63 Page: 395-409
ISSN: 0010-4620
Container-title: The Computer Journal
Language: en

Author:

Shyamasundar L B¹, Jhansi Rani P¹

Affiliation:

1. Department of Computer Science and Engineering, CMR Institute of Technology, Bengaluru, Karnataka 560037, India

Abstract

Twitter is an online micro-blogging platform through which one can explore valuable, otherwise hidden information about the current context at any point in time, and it also serves as a data source for sentiment analysis. In this paper, the sentiments of a large volume of tweets, treated as big data, are analyzed using machine learning algorithms. A multi-tier architecture for sentiment classification is proposed, comprising modules for tokenization, data cleaning, preprocessing, stemming, an updated lexicon, stopword and emoticon dictionaries, feature selection and a machine learning classifier. Unigrams and bigrams are used as features, with χ² (chi-squared) and Singular Value Decomposition for dimensionality reduction, two model types (Binary and Reg), four scaling methods (No scaling, Standard, Signed and Unsigned) and three vector formats (TF-IDF, Binary and Int). Accuracy is the evaluation metric for the random forest and bagged trees classifiers. Sentiments were analyzed through tokenization, several stages of preprocessing, and several combinations of feature vectors and classification methods, which made it possible to achieve an accuracy of 84.14%. The results indicate that the proposed scheme gives better accuracy than existing schemes in the literature.
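The feature steps named in the abstract (unigram and bigram extraction, a Binary vector format, and χ² feature selection) can be illustrated with a minimal sketch. This is not the authors' implementation: the whitespace tokenizer, the function names `ngrams`, `chi2_scores` and `select_top`, and the four toy tweets are all assumptions made for illustration, and the SVD, scaling and classifier stages are left out. The score is the standard χ² statistic on a 2×2 contingency table of feature presence against a binary sentiment label.

```python
def ngrams(text, n_max=2):
    """Lowercase, split on whitespace, and return unigrams then bigrams."""
    tokens = text.lower().split()
    feats = []
    for n in range(1, n_max + 1):
        feats += [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return feats


def chi2_scores(docs, labels):
    """Chi-squared score of each binary (present/absent) feature vs. a 0/1 label."""
    n = len(docs)
    n_pos = sum(labels)
    presence = [set(ngrams(d)) for d in docs]  # Binary vector format: presence only
    vocab = set().union(*presence)
    scores = {}
    for term in vocab:
        a = sum(1 for p, y in zip(presence, labels) if term in p and y == 1)
        b = sum(1 for p, y in zip(presence, labels) if term in p and y == 0)
        c = n_pos - a          # term absent, label positive
        d = (n - n_pos) - b    # term absent, label negative
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores


def select_top(scores, k):
    """Keep the k highest-scoring features; ties are broken alphabetically."""
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


# Hypothetical toy data, not taken from the paper's corpus.
tweets = ["good movie", "good day", "bad movie", "bad day"]
labels = [1, 1, 0, 0]  # 1 = positive sentiment, 0 = negative

scores = chi2_scores(tweets, labels)
top_features = select_top(scores, 2)
```

On this toy data, "good" and "bad" separate the two classes perfectly and receive the highest score, while "movie" and "day" appear equally in both classes and score zero, so a top-2 selection keeps only the sentiment-bearing words.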

Publisher

Oxford University Press (OUP)

Subject

General Computer Science

Link

http://academic.oup.com/comjnl/article-pdf/63/3/395/33106408/bxz038.pdf

References: 64 articles.

1. Twitter news stratification using random forest;Koyande;Int. J. Adv. Electron. Comput. Sci.,2015

2. Classification and regression tree method for forecasting;Muthu Visalatchi;Int. J. Comput. Sci. Mobile Comput.,2016

Cited by 7 articles.

1. A natural language processing approach for analyzing COVID-19 vaccination response in multi-language and geo-localized tweets;Healthcare Analytics;2023-11

2. Sentiment Analysis for Hotel Reviews: A Systematic Literature Review;ACM Computing Surveys;2023-09-15

3. Sentiment Analysis Techniques: A Review;Lecture Notes in Electrical Engineering;2023

4. A novel hybrid deep learning model for aspect based sentiment analysis;Concurrency and Computation: Practice and Experience;2022-12-19

5. Identifying Labor Market Competitors with Machine Learning Based on Maimai Platform;Applied Artificial Intelligence;2022-04-18