Affiliation:
1. Department of Computer Science, Umm Al-Qura University, Makkah 24243, Saudi Arabia
2. REsearch Groups in Intelligent Machines (REGIM-Lab), National Engineering School of Sfax, University of Sfax, Sfax 3038, Tunisia
Abstract
The spread of COVID-19 has affected more than 200 countries and raised serious public health concerns. The number of infections continues to rise despite the effectiveness of vaccines. An efficient and rapid COVID-19 surveillance system can help healthcare decision-makers contain the spread of the virus. In this study, we developed a novel framework based on machine learning (ML) models capable of detecting COVID-19 accurately at an early stage. Many models estimate risk by using social networking sites (SNSs) to track disease outbreaks. Twitter is one SNS that is widely used as a resource for real-time disease analysis and can provide an early warning to health officials. We introduced a two-step pipeline framework for outbreak prediction. In the first step, a hybrid word embedding method is used for tweet classification. In the second step, we combined the classified tweets with external features, such as the vaccine rate associated with infected cases, and passed them to machine learning algorithms for daily predictions. We applied the SVM, RF, and LR models for classification and the LSTM, Prophet, and SVR models for prediction. For the hybrid word embedding, we applied TF-IDF, FastText, and GloVe, as well as a combination of the three feature sets, to enhance classification. Furthermore, to improve forecast performance, we used vaccine data as input together with tweets and confirmed cases. The models achieve more than 80% accuracy, which supports the reliability of the proposed approach.
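The hybrid feature step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that "combination" means concatenating each tweet's TF-IDF vector with averaged dense word vectors, and it uses tiny hand-made lookup tables (`TOY_FASTTEXT`, `TOY_GLOVE`) as hypothetical stand-ins for pretrained FastText and GloVe embeddings. All function names are illustrative.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, TF-IDF vectors) for a list of tokenized documents."""
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each token.
    df = Counter(tok for doc in docs for tok in set(doc))
    # Smoothed IDF, so tokens present in every document keep a nonzero weight.
    idf = {tok: math.log((1 + n_docs) / (1 + df[tok])) + 1.0 for tok in vocab}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append(
            [(counts[tok] / len(doc)) * idf[tok] if doc else 0.0 for tok in vocab]
        )
    return vocab, vectors


def mean_embedding(doc, word_vectors, dim):
    """Average the dense vectors of known tokens; all zeros if none are known."""
    known = [word_vectors[tok] for tok in doc if tok in word_vectors]
    if not known:
        return [0.0] * dim
    return [sum(vec[i] for vec in known) / len(known) for i in range(dim)]


def hybrid_features(docs, embedding_tables):
    """Concatenate TF-IDF with one averaged embedding per (table, dim) pair."""
    _, sparse = tfidf_vectors(docs)
    features = []
    for doc, tfidf_row in zip(docs, sparse):
        row = list(tfidf_row)
        for table, dim in embedding_tables:
            row.extend(mean_embedding(doc, table, dim))
        features.append(row)
    return features


# Hypothetical toy tables standing in for pretrained FastText / GloVe vectors.
TOY_FASTTEXT = {"fever": [0.9, 0.1], "cough": [0.8, 0.2], "vaccine": [0.1, 0.9]}
TOY_GLOVE = {"fever": [0.7, 0.3, 0.0], "vaccine": [0.0, 0.2, 0.8]}

tweets = [
    ["fever", "and", "cough", "today"],
    ["got", "my", "vaccine", "today"],
]
features = hybrid_features(tweets, [(TOY_FASTTEXT, 2), (TOY_GLOVE, 3)])
```

Each row of `features` could then be passed to a classifier such as an SVM, random forest, or logistic regression. In practice the toy tables would be replaced by real pretrained embeddings, and the vocabulary would be fitted on training tweets only.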
Subject
General Mathematics, General Medicine, General Neuroscience, General Computer Science
Cited by
4 articles.