BACKGROUND
Predictive analytics through machine learning has been extensively using across industries including eHealth and mHealth for analyzing patient’s health data, predicting diseases, enhancing the productivity of technology or devices used for providing healthcare services and so on. However, not enough studies were conducted to predict the usage of eHealth by rural patients in developing countries.
OBJECTIVE
The objective of this study is to predict rural patients’ use of eHealth through supervised machine learning algorithms and propose the best-fitted model after evaluating their performances in terms of predictive accuracy.
METHODS
Data were collected between June and July 2016 through a field survey with structured questionnaire form 292 randomly selected rural patients in a remote North-Western sub-district of Bangladesh. Four supervised machine learning algorithms namely logistic regression, boosted decision tree, support vector machine, and artificial neural network were chosen for this experiment. A ‘correlation-based feature selection’ technique was applied to include the most relevant but not redundant features into the model. A 10-fold cross-validation technique was applied to reduce bias and over-fitting of the data.
RESULTS
Logistic regression outperformed other three algorithms with 85.9% predictive accuracy, 86.4% precision, 90.5% recall, 88.1% F-score, and AUC of 91.5% followed by neural network, decision tree and support vector machine with the accuracy rate of 84.2%, 82.9 %, and 80.4% respectively.
CONCLUSIONS
The findings of this study are expected to be helpful for eHealth practitioners in selecting appropriate areas to serve and dealing with both under-capacity and over-capacity by predicting the patients’ response in advance with a certain level of accuracy and precision.