Sentiment Analysis of Text Reviews Using Lexicon-Enhanced Bert Embedding (LeBERT) Model with Convolutional Neural Network

Author:

Mutinda James1ORCID,Mwangi Waweru2,Okeyo George3

Affiliation:

1. Department of Research and Consultancy, Kenya School of Government-Embu Campus, Embu P.O. Box 402-60100, Kenya

2. Department of Computing, Jomo Kenyatta University of Agriculture & Technology, Nairobi P.O. Box 62000-00200, Kenya

3. College of Engineering, Carnegie Mellon University Africa, Kigali Innovation City, Bumbogo, Kigali BP 6150, Rwanda

Abstract

Sentiment analysis has become an important area of research in natural language processing. This technique has a wide range of applications, such as comprehending user preferences in ecommerce feedback portals, politics, and in governance. However, accurate sentiment analysis requires robust text representation techniques that can convert words into precise vectors that represent the input text. There are two categories of text representation techniques: lexicon-based techniques and machine learning-based techniques. From research, both techniques have limitations. For instance, pre-trained word embeddings, such as Word2Vec, Glove, and bidirectional encoder representations from transformers (BERT), generate vectors by considering word distances, similarities, and occurrences ignoring other aspects such as word sentiment orientation. Aiming at such limitations, this paper presents a sentiment classification model (named LeBERT) combining sentiment lexicon, N-grams, BERT, and CNN. In the model, sentiment lexicon, N-grams, and BERT are used to vectorize words selected from a section of the input text. CNN is used as the deep neural network classifier for feature mapping and giving the output sentiment class. The proposed model is evaluated on three public datasets, namely, Amazon products’ reviews, Imbd movies’ reviews, and Yelp restaurants’ reviews datasets. Accuracy, precision, and F-measure are used as the model performance metrics. The experimental results indicate that the proposed LeBERT model outperforms the existing state-of-the-art models, with a F-measure score of 88.73% in binary sentiment classification.

Publisher

MDPI AG

Subject

Fluid Flow and Transfer Processes,Computer Science Applications,Process Chemistry and Technology,General Engineering,Instrumentation,General Materials Science

Reference41 articles.

1. Text Classification Using Novel Term Weighting Scheme-Based Improved TF-IDF for Internet Media Reports;Jiang;Math. Probl. Eng.,2021

2. Onan, A., and Üniversitesi, I.K. (2021). Ensemble of Classifiers and Term Weighting Schemes for Sentiment Analysis in Turkish. Sci. Res. Commun.

3. An overview on research challenges in opinion mining and sentiment analysis;Kalarani;Int. J. Innov. Res. Comput. Commun. Eng.,2015

4. Social media data analytics for business decision making system to competitive analysis;Yang;Inf. Process. Manag.,2021

5. Rao, L. (2022). Sentiment Analysis of English Text with Multilevel Features. Sci. Program.

Cited by 42 articles. 订阅此论文施引文献 订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3