Abstract
We used machine-learning algorithms to evaluate demographic and clinical data in an administrative data set and identify relevant predictors of mortality due to Listeria monocytogenes infection. We used the Spanish Minimum Basic Data Set at Hospitalization (MBDS-H), a mandatory registry of clinical discharge reports, to estimate the impact of several predictors on mortality. Data were coded with the International Classification of Diseases, Ninth or Tenth Revision. Diagnoses and clinical conditions were defined from these codes, individually or in combination. We built two predictive models using two different approaches. The first was logistic regression, a classic statistical approach, combined with data-science techniques for preprocessing the data and measuring performance. The second was a random forest algorithm, a strategy based on machine learning and feature selection. We compared the performance of the two models using predictive accuracy and the area under the receiver operating characteristic curve. Between 2001 and 2016, a total of 5603 hospitalized patients were identified as having any clinical form of listeriosis. Most patients were adults (94.9%), and 2318 (41.4%) of all hospitalized individuals were women. We recorded 301 pregnant women and 287 newborns with listeriosis. The mortality rate was 0.13 patients per 100,000 population. The model produced by logistic regression after intensive preprocessing performed similarly to the model produced by the random forest algorithm: in both models, predictive accuracy was 0.83 and the area under the receiver operating characteristic curve was 0.74. Sepsis, age, and malignancy were the most relevant features related to mortality. Our combined use of data science, preprocessing, conventional statistics, and machine learning provides insights into mortality due to Listeria-related infection. These methods are not mutually exclusive; their combined use would allow researchers to better explain results and understand data related to Listeria monocytogenes infection.
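The two comparison metrics named in the abstract, predictive accuracy and the area under the receiver operating characteristic curve, can be illustrated with a minimal Python sketch. This is not the authors' code. The function names `accuracy` and `roc_auc`, the 0.5 classification threshold, and the toy outcomes and predicted probabilities are all assumptions invented for illustration; none of the numbers come from the MBDS-H.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5) -> float:
    """Fraction of cases whose thresholded score matches the observed outcome."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, y_true)) / len(y_true)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative.

    A tie between a positive and a negative score counts as half a win
    (the Mann-Whitney U formulation of the ROC area).
    """
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented purely for illustration (1 = died, 0 = survived).
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 0, 1]
# Hypothetical predicted death probabilities from two already-fitted models.
scores_lr = [0.10, 0.20, 0.15, 0.30, 0.05, 0.60, 0.70, 0.40, 0.25, 0.80]
scores_rf = [0.12, 0.18, 0.22, 0.35, 0.08, 0.55, 0.65, 0.45, 0.20, 0.75]

for name, scores in [("logistic regression", scores_lr), ("random forest", scores_rf)]:
    print(f"{name}: accuracy={accuracy(y_true, scores):.2f}, AUC={roc_auc(y_true, scores):.2f}")
```

Accuracy depends on the chosen threshold, whereas the area under the curve depends only on how the model ranks patients, which is one reason for reporting both metrics together when comparing models.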
Cited by
8 articles.