Affiliation:
1. Université Catholique de Bukavu
3. International University of Grand-Bassam
3. Covenant University
4. Halle Institute for Economic Research
Abstract
Background
Extracting valuable insights from malaria routine surveillance data depends heavily on the processes and tools used to collect, curate, store, analyse, and disseminate the data and the essential information derived from it. The main challenge is ensuring the quality of data collected at the local level. In this work, we propose a new framework for Data Quality Assessment, designed for DHIS2 and based on machine learning techniques.
Methodology
The data used in this study were extracted from the DHIS2 platform for 8 districts of Mopti, Mali, for 2016 and 2017. We carried out three data preprocessing tasks. We developed four models based on machine learning algorithms for local and global outlier detection, trained and validated on malaria routine surveillance data extracted from DHIS2. We used five main evaluation metrics to assess the performance of the developed models. The design of the proposed framework considers the Report-Accuracy Assessment and Cross-Checks steps presented in the Malaria Routine Data Quality Assessment Tool (MRDQA Tool).
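The distinction between global and local outliers can be illustrated with a minimal sketch. The abstract does not name the four algorithms, so this example swaps in a simple median absolute deviation (MAD) rule, which is a statistical rule and not one of the study's machine learning models. The district names, monthly case counts, and the 3.5 threshold are hypothetical values chosen for illustration only.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return indices whose modified z-score (MAD-based) exceeds the threshold."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # No spread to compare against, so nothing can be flagged.
        return []
    return [i for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > threshold]


def global_outliers(data, threshold=3.5):
    """Flag values that are extreme relative to all districts pooled together."""
    keys = [(d, i) for d, series in data.items() for i in range(len(series))]
    pooled = [v for series in data.values() for v in series]
    return [keys[j] for j in mad_outliers(pooled, threshold)]


def local_outliers(data, threshold=3.5):
    """Flag values that are extreme relative to their own district's history."""
    return [(d, i) for d, series in data.items()
            for i in mad_outliers(series, threshold)]


# Hypothetical monthly confirmed-case counts per district (not the study's data).
reports = {
    "district_a": [120, 130, 125, 128, 122, 127],
    "district_b": [900, 950, 920, 940, 300, 930],
    "district_c": [400, 410, 9000, 395, 405, 402],
}

print("global:", global_outliers(reports))
print("local: ", local_outliers(reports))
```

In this toy data, the value 300 in `district_b` is unremarkable when all districts are pooled but stands out against that district's own reports, which is the kind of case a local detector is meant to catch.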
Results
For random errors (outliers), none of the four models reached an AUC of 60%. Despite the low AUC, precision scores exceeded 90%. Because AUC reflects the overall performance of the models, we conclude that random errors do not leave enough of a pattern in the malaria routine surveillance data to be detected. In contrast, the models detected systematic errors with good performance (87% AUC and 98% precision). This holds both for systematic errors with the same structure (same consecutive months and same columns) in two different districts and for systematic errors with different structures over the same time period in two different districts.
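How a low AUC can coexist with high precision can be shown with a small worked example. The labels and anomaly scores below are invented for illustration and are not the study's data; the 0.8 decision threshold is likewise an assumption, since the abstract does not state the threshold at which precision was measured.

```python
def precision_at(labels, scores, threshold):
    """Share of flagged records (score >= threshold) that are true errors."""
    flagged = [y for y, s in zip(labels, scores) if s >= threshold]
    return sum(flagged) / len(flagged) if flagged else 0.0


def roc_auc(labels, scores):
    """AUC as the probability that a random error outscores a random clean record."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# 1 = injected error, 0 = clean record (hypothetical values).
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.45, 0.1, 0.3, 0.4, 0.5, 0.25, 0.6, 0.35, 0.15]

print("precision @ 0.8:", precision_at(labels, scores, 0.8))
print("AUC:", roc_auc(labels, scores))
```

Here the single record flagged at the threshold is a true error, so precision is perfect, while the remaining errors score no higher than clean records and pull the AUC down to chance level.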
Conclusion
The machine learning models integrated into the proposed framework perform well in detecting random and systematic errors (global or local outliers) in the malaria routine surveillance data. With the proposed framework, only consistent and accurate data will be stored in the DHIS2 system. This will maximise the potential to extract actionable knowledge from malaria routine surveillance data and support better-informed decisions.
Publisher
Research Square Platform LLC