Abstract
The Web is one of the richest sources of consumer reviews and opinions. Many websites contain customer opinions in the form of reviews, blogs, discussion groups, and forums. This project focuses on customer reviews of restaurants. It predicts whether a given comment is positive or negative using supervised machine learning techniques. The project uses a dataset from the Kaggle website. The dataset consists of comments and the type of each comment (i.e., positive or negative). The project studies classification algorithms and text mining approaches for identifying the type of comment. First, duplicates are removed from the dataset. This is followed by text pre-processing, which involves removing punctuation marks and stop words, and then converting the whole text into vector format. The conversion from text to vectors is an essential step because English text cannot be used directly for the analysis, as we are working with linear algebra. To work with this data, it has to be converted to vector format, and we use CountVectorizer to do so. Finally comes the classification step, for which we use the Naive Bayes algorithm. The classifier separates the data into the two classes mentioned above. We take 70 percent of the data as the training set and 30 percent as the test set.
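The steps named in the abstract (duplicate removal, punctuation and stop-word removal, CountVectorizer, Naive Bayes, and a 70/30 split) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: it assumes scikit-learn, uses a small hypothetical list of reviews in place of the Kaggle dataset, picks the multinomial variant of Naive Bayes, and relies on scikit-learn's built-in English stop-word list. The names `REVIEWS`, `strip_punctuation`, and `run_pipeline` are invented for the example.

```python
import string

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy reviews (1 = positive, 0 = negative); NOT the Kaggle data.
REVIEWS = [
    ("The food was great and the service was friendly.", 1),
    ("Terrible pizza, cold and bland!", 0),
    ("Loved the pasta, will come back again.", 1),
    ("The waiter was rude and the soup was awful.", 0),
    ("Amazing dessert and lovely atmosphere.", 1),
    ("Worst burger I have ever eaten.", 0),
    ("Delicious curry, excellent value.", 1),
    ("Overpriced, slow, and disappointing.", 0),
    ("Fresh sushi and a wonderful staff.", 1),
    ("Dirty tables and stale bread.", 0),
    ("Terrible pizza, cold and bland!", 0),         # exact duplicate
    ("Amazing dessert and lovely atmosphere.", 1),  # exact duplicate
]


def strip_punctuation(text):
    """Remove ASCII punctuation marks from a review."""
    return text.translate(str.maketrans("", "", string.punctuation))


def run_pipeline(rows, seed=0):
    # 1. Drop exact duplicate (text, label) pairs, keeping first occurrences.
    rows = list(dict.fromkeys(rows))
    texts = [strip_punctuation(text) for text, _ in rows]
    labels = [label for _, label in rows]

    # 2. 70/30 train/test split, stratified so both classes appear in each part.
    x_train, x_test, y_train, y_test = train_test_split(
        texts, labels, test_size=0.3, random_state=seed, stratify=labels
    )

    # 3. Bag-of-words counts; stop words are removed by the vectorizer.
    #    Fitting on the training part only is a choice made for this sketch.
    vectorizer = CountVectorizer(lowercase=True, stop_words="english")
    train_vectors = vectorizer.fit_transform(x_train)
    test_vectors = vectorizer.transform(x_test)

    # 4. Naive Bayes classification (multinomial variant assumed here).
    model = MultinomialNB()
    model.fit(train_vectors, y_train)
    predictions = model.predict(test_vectors)
    accuracy = model.score(test_vectors, y_test)
    return rows, y_train, y_test, predictions, accuracy


if __name__ == "__main__":
    *_, accuracy = run_pipeline(REVIEWS)
    print(f"Test accuracy on the toy data: {accuracy:.2f}")
```

The abstract does not say whether vectorization happens before or after the split. The sketch fits the vocabulary on the training portion only, so that no information from the test reviews reaches the model. With so few examples, the reported accuracy says nothing about performance on the real dataset.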
Publisher
Blue Eyes Intelligence Engineering and Sciences Publication - BEIESP
Subject
Management of Technology and Innovation, General Engineering
Cited by
6 articles.
1. Enhanced Player Discovery via Machine Learning;International Journal of Advanced Research in Science, Communication and Technology;2024-05-24
2. DFR-TSD: A Sustainable Deep Learning Based Framework for Sustainable Robust Traffic Sign Detection under Challenging Weather Conditions;E3S Web of Conferences;2023
3. Go-Food Sentiment Analysis Using Twitter Data, Compared the Performance of the Random Forest Algorithm with That of the Linear Support Vector Classifier;Proceedings of the First Mandalika International Multi-Conference on Science and Engineering 2022, MIMSE 2022 (Informatics and Computer Science);2022
4. Crop Yield Prediction;International Journal of Scientific Research in Computer Science, Engineering and Information Technology;2021-06-15
5. Knowledge Extraction from Twitter Towards Infectious Diseases in Spanish;Communications in Computer and Information Science;2020