Abstract
Big Data lets companies create new growth opportunities and has given rise to whole new types of businesses that combine and evaluate the continually changing, rapidly growing data from the market. Today, Apache Spark and Hadoop are among the most popular and effective tools for big data analysis. This paper performs two types of analysis: sentiment analysis of tweets and tweet analysis. For sentiment analysis, we use a dataset created by Stanford University that contains four fields: the ID of the tweet, the sentiment of the tweet, the sentiment source, and the tweet itself. We compare the performance of two machine learning algorithms, Logistic Regression and Random Forest Classifier, and observe which one provides better accuracy and why. A data visualization tool, Tableau, is used to view the sentiments of tweets by filtering the tweet text by words and hashtags. For tweet analysis, we use Python and a library called Tweepy to download tweets from the Twitter API, and we analyze them with another data visualization tool, Power BI. The analysis is based on three parameters: user followers count, retweet count, and favourite count. Finally, we compare the two data visualization tools, Power BI and Tableau, in terms of performance and user-friendliness.
Cited by
7 articles.