Automated Annotation of Disease Subtypes

Published: 2023-09-25

Author:

Ofer Dan, Linial Michal

Abstract

Background: Distinguishing diseases into distinct subtypes is crucial for study, effective treatment, and the discovery of potential cures. The Open Targets Platform integrates biomedical, genetic, and biochemical datasets with the goal of empowering disease ontologies and gene targets. However, many disease annotations remain incomplete, necessitating laborious expert medical input. This is particularly painful for rare and orphan diseases, where resources are limited.

Results: We present a machine learning approach to identifying diseases with potential subtypes, using the approximately 23,000 diseases documented in Open Targets. We derive and describe novel features for predicting diseases with subtypes, using direct evidence. Machine learning models were applied to analyze feature importance and evaluate predictive performance for discovering known subtypes. Our model achieves a high (89.1%) ROC AUC. We integrated pre-trained deep learning language models and showed their benefits. Furthermore, we identify 515 disease candidates predicted to possess previously unannotated subtypes.

Conclusions: Our models can partition diseases into distinct subtypes. This methodology enables a robust, scalable approach for improving knowledge-based annotations and a comprehensive assessment of disease ontology tiers. Our candidates are attractive targets for further study and personalized medicine, potentially aiding in the unveiling of new therapeutic indications for sought-after targets.
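The ROC AUC reported in the abstract can be read as the probability that a randomly chosen disease with known subtypes receives a higher model score than a randomly chosen disease without them. The following is a minimal sketch of that metric in plain Python. It is not the authors' pipeline: the function name `roc_auc`, the toy labels, and the toy scores are all hypothetical and exist only to illustrate the calculation.

```python
def roc_auc(labels, scores):
    """ROC AUC via pairwise comparison of positives and negatives.

    Equals the fraction of (positive, negative) pairs in which the
    positive example has the higher score; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (assumption, not from the paper):
# label 1 = disease has annotated subtypes, label 0 = it does not.
toy_labels = [1, 1, 1, 0, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.7, 0.6]

toy_auc = roc_auc(toy_labels, toy_scores)
print(f"toy ROC AUC = {toy_auc:.3f}")  # 14 of 16 pairs ranked correctly -> 0.875
```

The pairwise form is quadratic in the number of examples, which is fine for illustration. For a dataset on the scale of the roughly 23,000 diseases mentioned above, a rank-based implementation from a standard library would be the more practical choice.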

Publisher

Cold Spring Harbor Laboratory
