Machine learning algorithms using national registry data to predict loss to follow-up during tuberculosis treatment
-
Published:2024-05-23
Issue:1
Volume:24
Page:
-
ISSN:1471-2458
-
Container-title:BMC Public Health
-
language:en
-
Short-container-title:BMC Public Health
Author:
Rodrigues Moreno M. S.,Barreto-Duarte Beatriz,Vinhaes Caian L.,Araújo-Pereira Mariana,Fukutani Eduardo R.,Bergamaschi Keityane Bone,Kristki Afrânio,Cordeiro-Santos Marcelo,Rolla Valeria C.,Sterling Timothy R.,Queiroz Artur T. L.,Andrade Bruno B.
Abstract
Background
Identifying patients at increased risk of loss to follow-up (LTFU) is key to developing strategies to optimize the clinical management of tuberculosis (TB). The use of national registry data in prediction models may be a useful tool to inform healthcare workers about risk of LTFU. Here we developed a score to predict the risk of LTFU during anti-TB treatment (ATT) in a nationwide cohort of cases using clinical data reported to the Brazilian Notifiable Disease Information System (SINAN).
Methods
We performed a retrospective study of all TB cases reported to SINAN between 2015 and 2022, excluding children (< 18 years old), vulnerable groups, and cases of drug-resistant TB. Only data available before treatment initiation were used for the score. We trained and internally validated three prediction scoring systems, based on Logistic Regression, Random Forest, and Light Gradient Boosting. Before fitting the models, we split the data into training (~ 80%) and test (~ 20%) sets, and then compared model metrics on the test set.
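The roughly 80/20 split described above can be sketched as follows. This is a minimal illustration using only the Python standard library. The stratification by outcome, the random seed, the function name `stratified_split`, and the toy labels are all assumptions made for the example; the source does not state how the split was drawn.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=42):
    """Return (train_idx, test_idx), keeping the outcome ratio similar in both sets.

    Hypothetical helper: stratifying by outcome is an assumption,
    not a detail reported in the abstract.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Synthetic labels (not SINAN data): 1 = LTFU, 0 = successfully treated.
labels = [1] * 170 + [0] * 830
train_idx, test_idx = stratified_split(labels)
```

Stratifying keeps the share of LTFU cases in the test set close to that in the training set, which matters when the outcome is the minority class.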
Results
Of the 243,726 cases included, 41,373 experienced LTFU whereas 202,353 were successfully treated. The groups differed with regard to several clinical and sociodemographic characteristics. Directly observed treatment (DOT) was unbalanced between the groups, with lower prevalence among those who were LTFU. Three models were developed to predict LTFU using 8 features (prior TB, drug use, age, sex, HIV infection and schooling level) with different score composition approaches. These prediction scoring systems exhibited an area under the curve (AUC) ranging between 0.71 and 0.72. The Light Gradient Boosting technique resulted in the best prediction performance, weighting specificity and sensitivity. A user-friendly web calculator app was developed (https://tbprediction.herokuapp.com/) to facilitate implementation.
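For readers unfamiliar with the metrics reported above, the sketch below shows how an AUC and a sensitivity/specificity trade-off can be computed from predicted risk scores. It is illustrative only: the toy scores are invented, the function names are hypothetical, and the use of Youden's J (sensitivity + specificity - 1) to balance the two quantities is an assumption, since the source does not say how they were weighted.

```python
def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(y_true, scores, threshold):
    """Classify score >= threshold as predicted LTFU and return (sensitivity, specificity)."""
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def best_threshold(y_true, scores):
    """Pick the observed score that maximizes Youden's J (an assumed criterion)."""
    def youden_j(threshold):
        sens, spec = sensitivity_specificity(y_true, scores, threshold)
        return sens + spec - 1
    return max(sorted(set(scores)), key=youden_j)


# Invented toy data: 1 = LTFU, 0 = successfully treated.
y_true = [0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.35, 0.6, 0.7, 0.8]

auc = roc_auc(y_true, scores)
threshold = best_threshold(y_true, scores)
sens, spec = sensitivity_specificity(y_true, scores, threshold)
```

The pairwise AUC definition is quadratic in the number of cases, so it suits small illustrations; on a registry-sized test set a rank-based implementation would be preferable.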
Conclusions
Our nationwide risk score predicts the risk of LTFU during ATT in Brazilian adults prior to treatment commencement utilizing schooling level, sex, age, prior TB status, and substance use (drug, alcohol, and/or tobacco). This is a potential tool to assist in decision-making strategies to guide resource allocation, DOT indications, and improve TB treatment adherence.
Funder
Fundação Oswaldo Cruz
Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
National Institute of Allergy and Infectious Diseases
Ministério da Saúde
Conselho Nacional de Desenvolvimento Científico e Tecnológico
Publisher
Springer Science and Business Media LLC