Data-Self-Check: A framework for automated Data Quality Assessment of Malaria routine surveillance data designed for DHIS2 using Machine Learning techniques

Authors:

Kuderha Ashuza (1), Kala Jules (2), Mungungu Baraka (1), Adingo Wisdom (3), Buzima Dunia (4), Naomi Ndifon (3)

Affiliations:

1. Université Catholique de Bukavu

2. International University of Grand-Bassam

3. Covenant University

4. Halle Institute for Economic Research

Abstract

Background: The value of the insights extracted from malaria routine surveillance data depends heavily on the processes and tools used to collect, curate, store, analyse, and disseminate the data and the information derived from it. The main challenge is ensuring the quality of data collected at the local level. In this work, we propose a new framework for Data Quality Assessment, designed for DHIS2 and based on Machine Learning techniques.

Methodology: The data used in this study were extracted from the DHIS2 platform for 8 districts of Mopti in Mali for 2016 and 2017. We carried out three data preprocessing tasks. We developed four models based on machine learning algorithms for local and global outlier detection, trained and validated on malaria routine surveillance data extracted from DHIS2. We used five main evaluation metrics to assess the performance of the models. The design of the proposed framework considers the Report-Accuracy Assessment and Cross-Checks steps presented in the Malaria Routine Data Quality Assessment Tool (MRDQA Tool).

Results: For random errors (outliers), none of the four models reached an AUC of 60%. Despite the low AUC, precision scores exceeded 90%. Because AUC reflects the overall performance of the models, we can say that random errors do not leave enough of a pattern in the malaria routine surveillance data to be detected. In contrast, systematic errors were detected with good performance (87% AUC and 98% precision). This holds both for systematic errors with the same structure (same consecutive months and same columns) in two different districts and for systematic errors with different structures over the same time period in two different districts.

Conclusion: The machine learning models integrated into the proposed framework perform well in detecting random and systematic errors (global or local outliers) in the malaria routine surveillance data. With the proposed framework, only consistent and accurate data will be stored in the DHIS2 system. This will maximise the potential to extract actionable knowledge from malaria routine surveillance data and support better-informed decisions.
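The Results report precision above 90% alongside an AUC below 60% for random errors. The sketch below illustrates how those two figures can coexist. It is only an illustration: the scores and labels are invented toy values, not data from the study, and the helper names `precision_at_threshold` and `roc_auc` and the 0.8 threshold are assumptions made for this example. The abstract does not say how the authors computed their metrics.

```python
def precision_at_threshold(scores, labels, threshold):
    """Fraction of flagged records (score >= threshold) that are true errors."""
    flagged = [y for s, y in zip(scores, labels) if s >= threshold]
    return sum(flagged) / len(flagged) if flagged else 0.0


def roc_auc(scores, labels):
    """Probability that a random erroneous record gets a higher anomaly
    score than a random clean record (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical anomaly scores: label 1 = record with an injected error.
# Two errors are extreme and easy to spot; the other four look like
# ordinary reports and are ranked among the clean records.
error_scores = [0.95, 0.90, 0.50, 0.30, 0.20, 0.10]
clean_scores = [0.60, 0.55, 0.45, 0.40, 0.35, 0.25, 0.15, 0.05]

scores = error_scores + clean_scores
labels = [1] * len(error_scores) + [0] * len(clean_scores)

precision = precision_at_threshold(scores, labels, threshold=0.8)
auc = roc_auc(scores, labels)

print(f"precision at 0.8: {precision:.2f}")  # 1.00: every flagged record is an error
print(f"ROC AUC:          {auc:.2f}")        # 0.58: overall ranking is near chance
```

In this toy case the detector flags only the two extreme records, so every flag is correct, while the remaining errors are ranked no better than clean records and pull the AUC down. This matches the abstract's reading that most random errors leave too little pattern to be separated from valid reports.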

Publisher

Research Square Platform LLC
