Impact of Effective Word Vectors on Deep Learning Based Subjective Classification of Online Reviews

Published:2024-07-05 Issue:3 Volume:4 Page:736-747
ISSN:2788-7669
Container-title:Journal of Machine and Computing
language:en
Short-container-title:JMC

Author:

B Priya Kamath¹, M Geetha¹, U Dinesh Acharya¹, Nandi Ritika², Urolagin Siddhaling³

Affiliation:

1. Department of Computer Science and Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka, India.

2. Department of Obstetrics and Gynecology, Kasturba Medical College, Manipal Academy of Higher Education, Manipal, Karnataka, India.

3. Department of Computer Science, Birla Institute of Technology & Science, Pilani, Dubai International Academic City, Dubai, UAE.

Abstract

Extracting subjective statements from online reviews considerably simplifies sentiment analysis tasks and reduces the overhead of the classifiers. A review dataset contains both subjective and objective sentences: subjective writing expresses the author's opinions, whereas objective text presents factual information. Assessing the subjectivity of review statements means categorizing them as objective or subjective. The effectiveness of word vectors plays a crucial role in this process, as they capture the semantics and contextual cues of subjective language. This study investigates the significance of employing sophisticated word vector representations to enhance the detection of subjective reviews. Several methodologies for generating word vectors are investigated, encompassing both conventional approaches, such as Word2Vec and Global Vectors for Word Representation (GloVe), and recent innovations, such as Bidirectional Encoder Representations from Transformers (BERT), ALBERT, and Embeddings from Language Models (ELMo). These neural word embeddings were applied using Keras and Scikit-Learn. The analysis focuses on Cornell subjectivity review data within the restaurant domain, and performance metrics such as accuracy, F1-score, recall, and precision are assessed on a dataset containing subjective reviews. A wide range of conventional vector models and deep learning-based word embeddings are used for subjective review classification, frequently in combination with deep learning architectures such as Long Short-Term Memory (LSTM). Notably, pre-trained BERT-base word embeddings achieved the highest accuracy, 96.4%, surpassing all other models considered in this study. However, BERT-base was observed to be expensive because of its larger structure.
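
As a minimal sketch of the four evaluation metrics named in the abstract, the Python snippet below computes accuracy, precision, recall, and F1-score for a binary subjective/objective labelling. It is an illustration only, not the authors' code: the function name `binary_metrics`, the label encoding (1 = subjective, 0 = objective), and the eight example labels are all assumptions made here.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, recall and F1 for binary labels.

    Assumed encoding (not from the paper): 1 = subjective, 0 = objective.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # subjective, predicted subjective
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # objective, predicted subjective
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # subjective, predicted objective
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # objective, predicted objective

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical gold labels and predictions for eight review sentences
# (illustrative values only, not data from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
scores = binary_metrics(y_true, y_pred)
print(scores)
```

With one false positive and one false negative among eight sentences, all four metrics come out equal here; on imbalanced data they generally diverge, which is one reason to report precision, recall, and F1-score alongside accuracy.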

Publisher

Anapub Publications

Link

https://anapub.co.ke/journals/jmc/jmc_pdf/2024/jmc_volume_4-issue_3/JMC202404069.pdf
