BACKGROUND
Highly detailed and invasive clinical investigations are needed to stratify haematuria patients into those with no disease, benign disease, and malignant disease. Given the heterogeneity of the patient population and the wide range of potential causes of haematuria, the ability to identify patient-specific biomarkers could improve and speed up the diagnostic process, which is crucial for patients with suspected cancer.
OBJECTIVE
We developed a new algorithm to identify the risk of bladder cancer in haematuria patients by analyzing multiple urine and serum biomarkers and identifying the most significant ones using complex network theory.
METHODS
We analyzed data collected in the HABIO case-control study of haematuria patients, comprising 675 participants (190 females, 485 males) aged between 40 and 80 years. We used the initial analysis pipeline of our Self-Supervised Semantic Learning (3SL) framework, which is grounded in complex network theory, to stratify participants into two groups: healthy (with no clear cause of haematuria) or sick (with bladder cancer, infection, etc.). We compared the sensitivity and specificity of our model with those of logistic regression and binary decision tree models. To assess model performance, we used balanced accuracy to account for the imbalance between the number of healthy and sick participants in the dataset. Additionally, to assess how linearly separable the biomarkers were, we applied k-means clustering.
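As an illustration of why balanced accuracy suits an imbalanced dataset, the following minimal Python sketch computes it as the mean of sensitivity and specificity. The function name `balanced_accuracy`, the label coding (1 = sick, 0 = healthy), and the toy labels are assumptions made for this example; they are not HABIO data and do not reproduce the 3SL pipeline.

```python
from typing import Sequence


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Return the mean of sensitivity and specificity for binary labels.

    Assumed coding for this sketch: 1 = sick, 0 = healthy.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # sick, predicted sick
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # sick, predicted healthy
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # healthy, predicted healthy
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # healthy, predicted sick

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("Both classes must be present in y_true.")

    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Hypothetical imbalanced toy labels: 8 sick and 2 healthy participants.
y_true = [1] * 8 + [0] * 2
# A trivial classifier that labels every participant as sick.
y_pred = [1] * 10

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"plain accuracy:    {plain_accuracy:.3f}")
print(f"balanced accuracy: {balanced_accuracy(y_true, y_pred):.3f}")
```

In this toy case, the trivial classifier scores well on plain accuracy simply because the sick group is larger, whereas balanced accuracy falls to chance level because no healthy participant is identified correctly.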
RESULTS
Our modelling outperformed logistic regression and binary decision trees, obtaining balanced accuracies of 0.693 (females) and 0.715 (males), versus 0.621 (females) and 0.533 (males) for logistic regression and 0.570 (females) and 0.597 (males) for binary trees. K-means clustering showed that the distribution of the biomarkers did not match clear macro-patterns. For the sick population (both genders), the most significant biomarkers were ones previously associated with infectious diseases and inflammation (thrombomodulin, sTNFRII and osmolarity) or bladder cancer (IL-8, TGF-β). Additionally, CXCL16, midkine, clusterin, CEA and 8-OHdG have previously been described in the literature as potential biomarkers for urinary tract cancers.
CONCLUSIONS
In this study, we applied a new algorithm to improve the diagnosis of haematuria in study participants. The algorithm performed better than widely applied methods (logistic regression, binary trees, k-means clustering). Additionally, by applying the 3SL algorithm, we identified the biomarkers most relevant to specific groups of patients and the dependencies between those biomarkers. We hope that our results can guide further research and provide new personalised diagnostic tools tailored directly to individual patients' needs.
CLINICALTRIAL
Ethical approval was obtained from the Office of Research Ethics Committee Northern Ireland (11/NI/0164).