Automated System for Movie Review Classification using BERT

Author:

Rana Shivani (1), Kanji Rakesh (1), Jain Shruti (2)

Affiliation:

1. Department of CSE & IT, Jaypee University of Information Technology, Solan, H.P., India

2. Department of ECE, Jaypee University of Information Technology, Solan, H.P., India

Abstract

Aims: Text classification has emerged as an important approach to advancing Natural Language Processing (NLP) applications that work with the text available on the web. Many applications for analyzing such text have been proposed in the literature.

Background: With the help of deep learning, NLP has achieved great success in automatically sorting text data into predefined classes, but this process is expensive and time-consuming.

Objectives: To overcome this problem, this paper studies and implements various machine learning techniques to build an automated system for movie review classification.

Methodology: The proposed methodology uses the Bidirectional Encoder Representations from Transformers (BERT) model for data preparation, and makes predictions with several machine learning algorithms: XGBoost, support vector machine, logistic regression, naïve Bayes, and a neural network. The algorithms are compared on accuracy, precision, recall, and F1 score.

Result: The results reveal that the neural network with two hidden layers outperforms the other models, achieving an F1 score above 0.90 within the first 15 epochs and 0.99 within 40 epochs on the IMDB dataset, thus reducing the training time to a great extent.

Conclusion: 100% accuracy is attained using the neural network, a 15% improvement in accuracy and a 14.6% improvement in F1 score over logistic regression.
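The four performance metrics named in the methodology can be illustrated with a minimal sketch for binary sentiment labels. This is not the authors' evaluation code: the function name `binary_metrics` and the label encoding (1 for a positive review, 0 for a negative one) are assumptions made here for illustration, and the sample labels are invented, not taken from the IMDB dataset.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall and F1 for 0/1 labels.

    Assumed encoding (hypothetical): 1 = positive review, 0 = negative review.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0

    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Invented toy labels, used only to exercise the function.
    truth = [1, 1, 1, 0, 0, 0, 1, 0]
    preds = [1, 1, 0, 0, 0, 1, 1, 0]
    print(binary_metrics(truth, preds))
```

Precision and recall are reported alongside accuracy because they separate the two kinds of error (false positives and false negatives) that accuracy alone hides, and F1 is their harmonic mean.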

Publisher

Bentham Science Publishers Ltd.

Subject

General Computer Science

