High performance Legionella pneumophila source attribution using genomics-based machine learning classification

Author:

Buultjens Andrew H. (1, 2), Vandelannoote Koen (3), Mercoulia Karolina (4), Ballard Susan (4), Sloggett Clare (4), Howden Benjamin P. (2, 4, 5), Seemann Torsten (4), Stinear Timothy P. (1, 2)

Affiliation:

1. Department of Microbiology and Immunology, Doherty Institute for Infection and Immunity, University of Melbourne, Melbourne, Victoria, Australia

2. Center for Pathogen Genomics, University of Melbourne, Melbourne, Victoria, Australia

3. Bacterial Phylogenomics Group, Institut Pasteur du Cambodge, Phnom Penh, Cambodia

4. Department of Microbiology and Immunology, Microbiology Diagnostic Unit, Doherty Institute for Infection and Immunity, University of Melbourne, Melbourne, Victoria, Australia

5. Department of Infectious Diseases, Austin Health, Heidelberg, Victoria, Australia

Abstract

Fundamental to effective Legionnaires’ disease outbreak control is the ability to rapidly identify the environmental source(s) of the causative agent, Legionella pneumophila. Genomics has revolutionized pathogen surveillance, but L. pneumophila has a complex ecology and population structure that can limit source inference based on standard core genome phylogenetics. Here, we present a powerful machine learning approach that assigns the geographical source of Legionnaires’ disease outbreaks more accurately than current core genome comparisons. Models were developed upon 534 L. pneumophila genome sequences, including 149 genomes linked to 20 previously reported Legionnaires’ disease outbreaks through detailed case investigations. Our classification models were developed in a cross-validation framework using only environmental L. pneumophila genomes. Assignments of clinical isolate geographic origins demonstrated high predictive sensitivity and specificity of the models, with no false positives or false negatives for 13 out of 20 outbreak groups, despite the presence of within-outbreak polyclonal population structure. Analysis of the same 534-genome panel with a conventional phylogenomic tree and a core genome multi-locus sequence type allelic distance-based classification approach revealed that our machine learning method had the highest overall classification performance (agreement with epidemiological information). Our multivariate statistical learning approach maximizes the use of genomic variation data and is thus well-suited for supporting Legionnaires’ disease outbreak investigations.

IMPORTANCE Identifying the sources of Legionnaires’ disease outbreaks is crucial for effective control. Current genomic methods, while useful, often fall short due to the complex ecology and population structure of Legionella pneumophila, the causative agent. Our study introduces a high-performing machine learning approach for more accurate geographical source attribution of Legionnaires’ disease outbreaks. Developed using cross-validation on environmental L. pneumophila genomes, our models demonstrate excellent predictive sensitivity and specificity. Importantly, this new approach outperforms traditional methods like phylogenomic trees and core genome multi-locus sequence typing, proving more efficient at leveraging genomic variation data to infer outbreak sources. Our machine learning algorithms, harnessing both core and accessory genomic variation, offer significant promise in public health settings. By enabling rapid and precise source identification in Legionnaires’ disease outbreaks, such approaches have the potential to expedite intervention efforts and curtail disease transmission.

Funder

DHAC | National Health and Medical Research Council

Publisher

American Society for Microbiology

Subject

Ecology, Applied Microbiology and Biotechnology, Food Science, Biotechnology
