Authors:
Cockburn Neil, Hammond Ben, Gani Illin, Cusworth Samuel, Acharya Aditya, Gokhale Krishna, Thayakaran Rasiah, Crowe Francesca, Minhas Sonica, Smith William Parry, Taylor Beck, Nirantharakumar Krishnarajah, Chandan Joht Singh
Abstract
Motivation
Data is increasingly used for improvement and research in public health, especially administrative data such as that collected in electronic health records. Patients enter and exit these typically open-cohort datasets non-uniformly, which can make simple questions about incidence and prevalence time-consuming to answer and can introduce unnecessary variation between analyses. We therefore developed methods to automate the analysis of incidence and prevalence in open-cohort datasets, to improve the transparency, productivity and reproducibility of analyses.
Implementation
We provide both a code-free set of rules for incidence and prevalence that can be applied to any open cohort, and a Python command-line interface (CLI) implementation of these rules, which requires Python 3.9 or later.
General features
The command-line interface calculates incidence and point prevalence time series from open-cohort data. The ruleset can be used to develop other implementations, or its rules can be rearranged to answer other analytical questions, such as period prevalence.
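To illustrate the kind of calculation involved, the following is a minimal Python sketch of point prevalence and incidence in an open cohort. It is not the authors' ruleset and not the code in the repository; the names `Patient`, `point_prevalence` and `incidence_rate`, and the conventions used (half-open registration windows, prevalent cases excluded from the incidence denominator, follow-up censored at the first event) are assumptions made for illustration only.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Patient:
    """Hypothetical record: observed in the cohort during [start, end)."""
    start: date                    # date the patient enters the cohort
    end: date                      # date the patient exits (exclusive)
    event: Optional[date] = None   # first recorded diagnosis, if any


def point_prevalence(patients: List[Patient], index_date: date) -> float:
    """Share of patients registered on index_date who already have the condition."""
    registered = [p for p in patients if p.start <= index_date < p.end]
    if not registered:
        return float("nan")
    cases = [p for p in registered if p.event is not None and p.event <= index_date]
    return len(cases) / len(registered)


def incidence_rate(patients: List[Patient], period_start: date, period_end: date) -> float:
    """New cases per person-year of at-risk follow-up in [period_start, period_end)."""
    events = 0
    person_days = 0
    for p in patients:
        lo = max(p.start, period_start)   # start of follow-up within the period
        hi = min(p.end, period_end)       # end of follow-up within the period
        if lo >= hi:
            continue                      # not observed during the period
        if p.event is not None and p.event < lo:
            continue                      # prevalent case: not at risk
        if p.event is not None and p.event < hi:
            events += 1
            hi = p.event                  # follow-up stops at the first event
        person_days += (hi - lo).days
    if person_days == 0:
        return float("nan")
    return events / (person_days / 365.25)


# Small made-up cohort with staggered entry and exit dates.
cohort = [
    Patient(date(2020, 1, 1), date(2022, 1, 1), date(2020, 7, 1)),  # incident case
    Patient(date(2019, 1, 1), date(2021, 1, 1)),                    # never diagnosed
    Patient(date(2020, 6, 1), date(2021, 6, 1), date(2019, 3, 1)),  # prevalent at entry
    Patient(date(2021, 3, 1), date(2022, 1, 1)),                    # enters after 2020
]

prevalence_sep_2020 = point_prevalence(cohort, date(2020, 9, 1))
incidence_2020 = incidence_rate(cohort, date(2020, 1, 1), date(2021, 1, 1))
print(f"Point prevalence on 2020-09-01: {prevalence_sep_2020:.3f}")
print(f"Incidence in 2020: {incidence_2020:.3f} per person-year")
```

Calling either function over a sequence of index dates or consecutive periods would give a time series. Each convention chosen here (for example, whether the exit date counts as observed time) is the kind of decision that a shared ruleset is meant to fix in advance.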
Availability
The command-line interface is freely available from https://github.com/THINKINGGroup/analogy_publication.
Publisher
Springer Science and Business Media LLC