Abstract
Background: Tuberculosis (TB) is the leading cause of infectious disease mortality worldwide. Numerous blood-based gene expression signatures have been proposed in the literature as alternative tools for diagnosing TB infection, and efforts to develop additional signatures for other TB-related contexts are ongoing. However, the generalizability of these signatures to different patient contexts is not well characterized. There is a pressing need for a well-curated database of TB gene expression studies to support the systematic assessment of existing and newly developed TB gene signatures.

Results: We built curatedTBData, a manually curated database of 49 TB transcriptomic studies. This data resource is freely available through GitHub and as an R Bioconductor package that allows users to validate new and existing biomarkers without the challenges of harmonizing heterogeneous studies. We also demonstrate the use of this data resource with cross-study comparisons of 72 TB gene signatures. For distinguishing subjects with active TB from healthy controls, 19 gene signatures had a weighted mean AUC of 0.90 or greater, with the highest at 0.94. For active TB disease versus latent TB infection, 7 gene signatures had a weighted mean AUC of 0.90 or greater, with a maximum of 0.93. We also explore ensembling methods that average predictions from multiple gene signatures to significantly improve diagnostic ability beyond that of any single signature.

Conclusions: The curatedTBData data package offers a comprehensive resource of curated gene expression and clinically annotated data. It could be used to identify robust new TB gene signatures, to perform comparative analysis of existing TB gene signatures, and to develop alternative gene set scoring or ensembling methods, among other things. This resource will also facilitate the development of new signatures that are generalizable across cohorts or more applicable to specific subsets of patients (e.g. those with rare comorbid conditions). We demonstrated that these blood-based gene signatures could distinguish patients with distinct TB outcomes; moreover, combining multiple gene signatures could improve the overall predictive accuracy in differentiating these subtypes, which points to an important consideration for translating genomics into clinical implementation.
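The two summary quantities mentioned above, a weighted mean AUC across studies and an ensemble that averages predictions from several signatures, can be illustrated with a minimal sketch. It is written in Python rather than R and does not use the curatedTBData package. The function names, the choice of study sample size as the weight, and the plain mean used for ensembling are all assumptions made for illustration; the abstract does not specify the weighting scheme or the ensembling method, and the data below are toy values.

```python
from typing import Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive subject has the higher score; ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative subject")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def weighted_mean_auc(aucs: Sequence[float], sizes: Sequence[int]) -> float:
    """Mean of per-study AUCs, weighted by study sample size.
    Weighting by sample size is an assumption of this sketch."""
    return sum(a * n for a, n in zip(aucs, sizes)) / sum(sizes)


def ensemble_scores(signature_scores: Sequence[Sequence[float]]) -> list:
    """Average each subject's score across signatures (simple mean).
    Assumes all signatures are already on a comparable scale."""
    n_subjects = len(signature_scores[0])
    n_signatures = len(signature_scores)
    return [
        sum(sig[i] for sig in signature_scores) / n_signatures
        for i in range(n_subjects)
    ]


# Toy data: 1 = active TB, 0 = control; two hypothetical signatures.
labels = [1, 1, 0, 0]
sig_a = [0.9, 0.4, 0.5, 0.1]
sig_b = [0.3, 0.8, 0.2, 0.6]

print("signature A AUC:", auc(sig_a, labels))
print("signature B AUC:", auc(sig_b, labels))
print("ensemble AUC:", auc(ensemble_scores([sig_a, sig_b]), labels))
print("weighted mean AUC:", weighted_mean_auc([0.9, 0.8], [100, 300]))
```

In this toy case each signature misranks a different subject, so averaging the two scores separates the groups better than either signature alone. Real signatures often produce scores on different scales, so some form of standardization before averaging would likely be needed.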
Publisher
Cold Spring Harbor Laboratory