Abstract
The sentiment of news headlines is an important factor that investors consider when making investment decisions. We claim that the sentiment of financial news headlines affects stock market values. To test this claim, financial news headlines are collected along with stock market data over a period of time. Using the Valence Aware Dictionary and sEntiment Reasoner (VADER) for sentiment analysis, the correlation between stock market values and the sentiment in news headlines is established. In our experiments, stock market price data are collected from Yahoo Finance and Kaggle. Financial news headlines are collected from the Wall Street Journal, the Washington Post, and the Business-Standard website. To cope with this large volume of data and extract useful information, various embedding methods, such as Bag-of-Words (BoW) and Term Frequency-Inverse Document Frequency (TF-IDF), are employed. The resulting features are then fed into machine learning models such as Naive Bayes and XGBoost, as well as deep learning models such as Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM). Various natural language processing, machine learning, and deep learning algorithms are considered in our study, with the aim of attaining higher accuracy than the current state of the art. Our experimental study shows that CNN (80.86%) and LSTM (84%) outperform the machine learning models, namely Support Vector Machine (SVM) (50.3%), Random Forest (67.93%), and Naive Bayes (59.79%). Moreover, two novel methods, BERT and RoBERTa, were applied with the expectation of better performance than all the other models, and they performed best, achieving accuracies of 90% and 88%, respectively.
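As an illustration of the BoW and TF-IDF representations named in the abstract, the following is a minimal sketch in plain Python. It is not the authors' implementation: the toy headlines, the whitespace tokenizer, and the function names `tokenize`, `bag_of_words`, and `tf_idf` are all assumptions made for this example, and the unsmoothed formula idf = log(N / df) is one common textbook variant (libraries such as scikit-learn use smoothed variants by default).

```python
import math
from collections import Counter

# Hypothetical toy headlines, invented for illustration only.
HEADLINES = [
    "stocks rally as earnings beat forecasts",
    "stocks fall as inflation fears grow",
    "bank earnings beat forecasts",
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def bag_of_words(docs):
    """Return the sorted vocabulary and one raw term-count vector per document."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({tok for doc in tokenized for tok in doc})
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tf_idf(docs):
    """Weight relative term frequencies by idf = log(N / df) (unsmoothed)."""
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = [sum(1 for vec in counts if vec[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) for d in df]
    weighted = []
    for vec in counts:
        total = sum(vec)  # number of tokens in this document
        weighted.append([(c / total) * w for c, w in zip(vec, idf)])
    return vocab, weighted


vocab, bow = bag_of_words(HEADLINES)
_, tfidf = tf_idf(HEADLINES)
```

In this sketch, a term that appears in only one headline (such as "rally") receives a larger weight than a term shared by two headlines (such as "stocks"), which is the property that makes TF-IDF vectors more discriminative than raw counts as input to a classifier.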
Subject
General Mathematics, Engineering (miscellaneous), Computer Science (miscellaneous)
References: 64 articles.
1. AP CorpComm on Twitter: “Advisory: @AP Twitter Account Has Been Hacked. Tweet About an Attack at the White House Is False. We Will Advise More as soon as Possible.”;Twitter;https://twitter.com/ap_corpcomm/status/326750712669282306
2. VADER: A parsimonious rule-based model for sentiment analysis of social media text;Hutto;Proceedings of the Eighth International AAAI Conference on Weblogs and Social Media,2014
3. BERT: Pre-training of deep bidirectional transformers for language understanding;Devlin;arXiv,2018
4. RoBERTa: A Robustly Optimized BERT Pretraining Approach;Liu;arXiv,2019
5. Sentiment analysis of Twitter data for predicting stock market movements;Pagolu;Proceedings of the International Conference on Signal Processing, Communication, Power and Embedded System (SCOPES),2016
Cited by
8 articles.