Exploring performance of clustering methods on document sentiment analysis

Published:2016-07-10 Issue:1 Volume:43 Page:54-74
ISSN:0165-5515
Container-title:Journal of Information Science
language:en
Short-container-title:Journal of Information Science

Author:

Ma Baojun¹, Yuan Hua², Wu Ye³

Affiliation:

1. School of Economics and Management, Beijing University of Posts and Telecommunications, China

2. School of Management and Economics, University of Electronic Science and Technology of China, China

3. School of Science, Beijing University of Posts and Telecommunications, China

Abstract

Clustering is a powerful unsupervised tool for sentiment analysis of text. However, clustering results may be affected by any step of the clustering process, such as the data pre-processing strategy, the term weighting method in the Vector Space Model and the clustering algorithm. This paper presents the results of an experimental study of some common clustering techniques applied to the task of sentiment analysis. Unlike previous studies, we investigate in particular the combined effects of these factors through a series of comprehensive experiments. The experimental results indicate that, first, in terms of clustering accuracy, K-means-type clustering algorithms show clear advantages on balanced review datasets but perform rather poorly on unbalanced datasets. Second, the comparatively newly designed weighting models are better than the traditional weighting models for sentiment clustering on both balanced and unbalanced datasets. Furthermore, extracting adjectives and adverbs offers obvious improvements in clustering performance, while stemming and stopword removal have a negative influence on sentiment clustering. The experimental results should be valuable for both the study and the use of clustering methods in online review sentiment analysis.

Publisher

SAGE Publications

Subject

Library and Information Sciences, Information Systems

Link

http://journals.sagepub.com/doi/pdf/10.1177/0165551515617374

Reference: 73 articles.

1. Opinion Mining and Sentiment Analysis

2. Thumbs up?

3. Sentiment analysis: A combined approach

4. From Frequency to Meaning: Vector Space Models of Semantics

5. Document clustering based on non-negative matrix factorization

Cited by 38 articles.

1. Big Data Anomaly Prediction Algorithm of Smart City Power Internet of Things Based on Parallel Random Forest;Journal of Testing and Evaluation;2024-02-15

2. SPSO-EFVM: A Particle Swarm Optimization-Based Ensemble Fusion Voting Model for Sentence-Level Sentiment Analysis;IEEE Access;2024

3. A Unified Deep Learning Framework for Sentiment Analysis of Reviews;Studies in Computational Intelligence;2024

4. Identification of domain-specific euphemistic tweets using clustering;International Journal of Information Technology;2023-11-22

5. A two-stage unsupervised sentiment analysis method;Multimedia Tools and Applications;2023-03-08