Affiliation:
1. School of Computing and Augmented Intelligence, Arizona State University, Tempe, AZ, USA
Abstract
There is tremendous value in the ability to predict stock market trends and outcomes. The public sentiment surrounding a stock is unquestionably a vital factor contributing to the rise or fall of its price. This paper details how public sentiment data can be integrated into traditional stock analyses and how these analyses can then be used to predict stock price trends. Headlines from seven news publications and posts from Yahoo! Finance’s conversations forum were processed by the Valence Aware Dictionary and sEntiment Reasoner (VADER) natural language processing package to determine numerical polarities that represent a positive, negative, or neutral public sentiment around a stock ticker. The resulting polarities were paired with popular stock-table metrics to create a dataset for a Logistic Regression machine learning model. The model was trained on approximately 4400 major stocks to determine a binary “Buy” (1) or “Not Buy” (0) recommendation for each stock. We present approaches for a variety of machine learning models (Logistic Regression, Random Forest Classifiers, and Extreme Gradient Boosting) and statistically compare the balanced accuracies of these models. Overall, the paper presents our understanding of how one can leverage the public sentiment of a stock to gauge its future value.
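As a brief illustration of the evaluation metric named above, balanced accuracy for a binary “Buy” (1) / “Not Buy” (0) task can be computed as the mean of the recall on each class. The sketch below is not the paper's code; the function name and the toy labels are hypothetical and chosen only to show why the metric is informative when one class dominates.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall for binary labels: 1 = Buy, 0 = Not Buy."""
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(1 for t, p in pairs if t == 1 and p == 1)
    true_neg = sum(1 for t, p in pairs if t == 0 and p == 0)
    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = sum(1 for t in y_true if t == 0)
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present in y_true")
    recall_buy = true_pos / n_pos
    recall_not_buy = true_neg / n_neg
    return 0.5 * (recall_buy + recall_not_buy)


# Hypothetical, made-up labels (not data from the paper): 2 Buy, 6 Not Buy.
y_true = [1, 0, 0, 0, 0, 0, 0, 1]

# A degenerate model that always predicts "Not Buy".
always_not_buy = [0] * len(y_true)

# A model that finds one of the two Buy stocks and makes one false alarm.
mixed = [1, 0, 0, 1, 0, 0, 0, 0]

print(balanced_accuracy(y_true, always_not_buy))
print(balanced_accuracy(y_true, mixed))
```

On these toy labels, the always-“Not Buy” predictor has a plain accuracy of 0.75 but a balanced accuracy of only 0.5, which is why balanced accuracy is a reasonable basis for comparing classifiers when “Buy” labels are relatively rare.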
Publisher
World Scientific Pub Co Pte Ltd