Summarizing Online Movie Reviews: A Machine Learning Approach to Big Data Analytics

Author:

Khan Atif (1), Gul Muhammad Adnan (1), Uddin M. Irfan (2), Ali Shah Syed Atif (3), Ahmad Shafiq (4), Al Firdausi Muhammad Dzulqarnain (4), Zaindin Mazen (5)

Affiliation:

1. Department of Computer Science, Islamia College Peshawar, Peshawar, Pakistan

2. Institute of Computing, Kohat University of Science and Technology, Kohat, Pakistan

3. Faculty of Engineering and Information Technology, Northern University, Nowshehra, Pakistan

4. Industrial Engineering Department, College of Engineering, King Saud University, P.O. Box 800, Riyadh 11421, Saudi Arabia

5. Department of Statistics and Operations Research, College of Science, King Saud University, P.O. Box 2455, Riyadh 11451, Saudi Arabia

Abstract

Information on the web is growing at an exponential pace, and online movie reviews are a substantial source of information for users. However, users write millions of movie reviews on a regular basis, and it is not feasible for readers to condense them manually. Classification and summarization of reviews is a difficult task in computational linguistics. Hence, an automatic method is needed to summarize the vast number of movie reviews, allowing users to quickly distinguish between the positive and negative features of a movie. This work proposes a classification and summarization method for movie reviews. For movie review classification, a bag-of-words feature extraction technique is used to extract unigrams, bigrams, and trigrams as a feature set from the given review documents and to represent each review document as a vector. Next, the Naïve Bayes algorithm is employed to categorize the movie reviews (represented as feature vectors) into negative and positive reviews. For movie review summarization, a word2vec model is used to extract features from the classified movie review sentences, and a semantic clustering technique is then used to cluster semantically related review sentences. Different text features are employed to compute the salience score of every review sentence in the clusters. Finally, the top-ranked review sentences are selected based on their salience scores to form a summary of the movie reviews. Empirical results indicate that the proposed machine learning approach performed better than benchmark summarization approaches.
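The classification stage described in the abstract can be illustrated with a minimal sketch. It assumes whitespace tokenization, raw n-gram counts, and Laplace (add-one) smoothing; the abstract does not specify these details, so they are illustrative choices rather than the authors' settings. The names `ngrams` and `NaiveBayes` and the four toy reviews are hypothetical and not taken from the paper.

```python
import math
from collections import Counter, defaultdict


def ngrams(text, max_n=3):
    """Return the unigrams, bigrams, and trigrams of a lowercased text.

    Tokenization by whitespace is an assumption made for this sketch.
    """
    tokens = text.lower().split()
    feats = []
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            feats.append(" ".join(tokens[i:i + n]))
    return feats


class NaiveBayes:
    """Multinomial Naive Bayes over bag-of-n-grams counts (illustrative)."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.feat_counts = defaultdict(Counter)  # n-gram counts per class
        self.vocab = set()
        for doc, label in zip(docs, labels):
            feats = ngrams(doc)
            self.feat_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior of the class.
            score = math.log(n_docs / total_docs)
            class_total = sum(self.feat_counts[label].values())
            for feat in ngrams(doc):
                if feat not in self.vocab:
                    continue  # ignore n-grams never seen in training
                # Laplace-smoothed log likelihood (assumed smoothing scheme).
                count = self.feat_counts[label][feat]
                score += math.log((count + 1) / (class_total + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented purely for illustration.
train_docs = [
    "a great and moving film",
    "great acting and a wonderful story",
    "a boring and terrible film",
    "terrible acting and a dull story",
]
train_labels = ["pos", "pos", "neg", "neg"]

model = NaiveBayes().fit(train_docs, train_labels)
print(model.predict("a wonderful and great story"))
print(model.predict("a dull and boring film"))
```

Working in log space avoids numerical underflow when many n-gram probabilities are multiplied, and skipping unseen n-grams keeps the comparison between classes fair for this small vocabulary. The summarization stage (word2vec features, semantic clustering, salience ranking) is not covered by this sketch.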

Funder

Deanship of Scientific Research, King Saud University

Publisher

Hindawi Limited

Subject

Computer Science Applications, Software

