AI-based disease category prediction model using symptoms from low-resource Ethiopian language: Afaan Oromo text

Published:2024-05-16 Issue:1 Volume:14
ISSN:2045-2322
Container-title:Scientific Reports
language:en
Short-container-title:Sci Rep

Author:

Dinsa Etana Fikadu, Das Mrinal, Abebe Teklu Urgessa

Abstract

Automated disease diagnosis and prediction, powered by AI, play a crucial role in enabling medical professionals to deliver effective care to patients. While such predictive tools have been extensively explored in resource-rich languages like English, this manuscript focuses on automatically predicting disease categories from symptoms documented in the Afaan Oromo language, using various classification algorithms. The study covers machine learning techniques (support vector machines, random forests, logistic regression, and Naïve Bayes) as well as deep learning approaches (LSTM, GRU, and Bi-LSTM). Because no standard corpus was available, we prepared three data sets with different numbers of patient symptoms arranged into 10 categories. Two feature representations, TF-IDF and word embedding, were employed. Performance was evaluated using accuracy, recall, precision, and F1 score. The experimental results show that, among the machine learning models, the SVM model using TF-IDF had the highest accuracy and F1 score of 94.7%, while among the deep learning models, the LSTM model using word2vec embedding achieved an accuracy of 95.7% and an F1 score of 96.0%. Several hyper-parameter settings were tuned to improve the performance of each model. Overall, the LSTM model proved to be the best of all the models over the entire dataset.
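As a rough illustration of two ingredients the abstract names, TF-IDF features and the accuracy, precision, recall, and F1 metrics, the following is a minimal sketch in plain Python. It is not the authors' implementation. The abstract does not say which TF-IDF variant or which averaging scheme was used, so the smoothed IDF formula and macro averaging here are assumptions. The function names `tfidf` and `macro_scores`, the English placeholder tokens, and the category labels are hypothetical toy values, not data from the study.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: raw term count times smoothed IDF,
    idf(t) = ln((1 + N) / (1 + df(t))) + 1, with no length normalization.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log((1 + n_docs) / (1 + d)) + 1 for t, d in df.items()}
    return [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]


def macro_scores(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy placeholder symptom tokens (hypothetical, in English for readability).
docs = [["fever", "cough", "cough"], ["fever", "headache"], ["rash", "itching"]]
weights = tfidf(docs)

# Toy placeholder labels and predictions (hypothetical).
y_true = ["respiratory", "respiratory", "skin", "skin"]
y_pred = ["respiratory", "skin", "skin", "skin"]
scores = macro_scores(y_true, y_pred)
print(weights[0])
print(scores)
```

In this toy example, "cough" outweighs "fever" in the first document because it occurs twice there and appears in fewer documents overall. A classifier such as an SVM would then be trained on vectors built from these weights.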

Publisher

Springer Science and Business Media LLC

Link

https://www.nature.com/articles/s41598-024-62278-7.pdf
