Abstract
This paper explores whether there are anomalies in high-frequency stock trades in South Africa. Using JSE daily data from 2010 to 2022, we hypothesize that high-frequency stock data is complex and carries hidden but important information that can be helpful to investors. In high-frequency trading settings, traders should be able to extract and disseminate information from complex data sets quickly, efficiently, and profitably. Given any stock portfolio, they should be able to separate the risky stocks from the less risky, more rewarding ones before making any investment decision. However, this is harder to achieve in emerging and less technologically advanced economies like South Africa, which still rely on traditional trading norms (managerial expertise and emotional trading). This paper therefore aims to provide a solution to this fundamental problem. First, we study the time-stamped behavior of stock prices using a long short-term memory (LSTM) model. We note that JSE stock prices are non-stationary, have fat tails, and have long memory, reflecting the stocks' ARCH effects and volatility traits. Second, we employ the Random Forest (RF) algorithm to capture further useful stock features and classify the data quickly. We trained the model hourly to capture anomalous data, classify trades, and convert them into profitable trades. With this model, we classified stock trades into three categories: high premium (less risky), premium (satisfactory), and doubtful (high risk). Ideally, volatile stocks with low returns are riskier (doubtful), and the reverse holds otherwise. We evaluate our RF model using out-of-bag (OOB) error and cross-validation. Only minor prediction errors were reported as the number of trees increased, signaling the model's robustness in capturing the stylized facts embedded in stock trades.
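The three-way labeling described above can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not state how the categories are derived, so the rule below (compare the volatility and mean of log returns against two cutoffs) and the names `risk_label`, `vol_cut`, and `ret_cut` are assumptions made for illustration only. Labels of this kind could then serve as targets for a classifier such as Random Forest.

```python
import math
import statistics


def log_returns(prices):
    """Log returns r_t = ln(P_t / P_{t-1}) from a price series."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def risk_label(prices, vol_cut, ret_cut):
    """Assign one of the three categories named in the abstract.

    vol_cut and ret_cut are hypothetical thresholds chosen by the user;
    they are not values taken from the paper.
    """
    r = log_returns(prices)
    mean_ret = statistics.fmean(r)   # average log return
    vol = statistics.stdev(r)        # sample standard deviation as volatility
    if vol > vol_cut and mean_ret < ret_cut:
        return "doubtful"            # volatile with low returns: high risk
    if vol <= vol_cut and mean_ret >= ret_cut:
        return "high premium"        # calm with adequate returns: less risky
    return "premium"                 # mixed profile: satisfactory


# Toy price paths, invented for illustration (not JSE data).
steady = [100, 101, 102, 103, 104]
choppy = [100, 90, 101, 88, 95]
mixed = [100, 110, 100, 112, 105]

for name, path in [("steady", steady), ("choppy", choppy), ("mixed", mixed)]:
    print(name, risk_label(path, vol_cut=0.02, ret_cut=0.0))
```

In practice the cutoffs would more plausibly be set from the cross-section of stocks (for example, quantiles of volatility and return) than fixed by hand as in this sketch.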
Publisher
Research Square Platform LLC