Improving patient clustering by incorporating structured label relationships in similarity measures

Author:

Lambert Judith, Leutenegger Anne-Louise, Baudot Anaïs, Jannot Anne-Sophie

Abstract

Context: Patient stratification is the cornerstone of numerous health studies, serving to enhance medicine efficacy estimation and facilitate patient matching. To stratify patients, similarity measures between patients can be computed from medical health record databases, such as medico-administrative databases. Importantly, the variables included in medico-administrative databases can be associated with labels, which can be organized in ontologies or other classification systems. However, to the best of our knowledge, the relevance of considering such label classifications when computing patient similarity measures has been poorly studied.

Objective: We propose and evaluate several weighted versions of the Cosine similarity that consider structured label relationships to compute patient similarities from a medico-administrative database.

Material and Methods: As a use case, we analyze medicine reimbursements contained in the Échantillon Généraliste des Bénéficiaires, a French medico-administrative database. We compute the standard Cosine similarity between patients based on their medicine reimbursements. In addition, we compute a weighted Cosine similarity measure that includes variable frequencies and two weighted Cosine similarity measures that consider label relationships. We construct patient networks from each similarity measure and identify clusters of patients. We evaluate the performance of the different similarity measures with enrichment tests using information on chronic diseases.

Results: The similarity measures that include label relationships perform better at identifying similar patients. Indeed, using these weighted measures, we identify distinct patient clusters with a higher number of chronic disease enrichments than with the other measures. Importantly, the enrichment tests provide clinically interpretable insights into these patient clusters.

Conclusion: Considering label relationships when computing patient similarities improves the stratification of patients with regard to their health status.
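To make the difference between the standard and a weighted Cosine similarity concrete, the following is a minimal Python sketch. It is not the authors' implementation: the function names `weighted_cosine` and `frequency_weights`, the toy patient vectors, and the logarithmic frequency weighting are all assumptions chosen for illustration. The abstract does not state how the frequency weights or the label-relationship weights are actually defined.

```python
import math


def weighted_cosine(x, y, w):
    """Weighted cosine similarity between two patient vectors.

    x, y: per-medicine values for two patients (e.g. 0/1 reimbursement flags).
    w:    non-negative weight per medicine (hypothetical; uniform weights
          give back the standard Cosine similarity).
    """
    if not (len(x) == len(y) == len(w)):
        raise ValueError("x, y and w must have the same length")
    num = sum(wi * xi * yi for xi, yi, wi in zip(x, y, w))
    norm_x = math.sqrt(sum(wi * xi * xi for xi, wi in zip(x, w)))
    norm_y = math.sqrt(sum(wi * yi * yi for yi, wi in zip(y, w)))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # convention: no shared weighted information
    return num / (norm_x * norm_y)


def frequency_weights(patients):
    """Illustrative frequency-based weights (an assumption, not the paper's).

    Medicines reimbursed for fewer patients get a larger weight,
    using log(1 + N / n_j) where n_j counts patients with medicine j.
    """
    n_patients = len(patients)
    n_vars = len(patients[0])
    weights = []
    for j in range(n_vars):
        count = sum(1 for p in patients if p[j] > 0)
        weights.append(math.log(1 + n_patients / count) if count else 0.0)
    return weights


# Toy data: rows are patients, columns are three hypothetical medicines.
patients = [
    [1, 1, 0],
    [1, 0, 1],
    [1, 1, 0],
]
uniform = [1.0, 1.0, 1.0]
freq_w = frequency_weights(patients)

standard_sim = weighted_cosine(patients[0], patients[1], uniform)
weighted_sim = weighted_cosine(patients[0], patients[1], freq_w)
```

With uniform weights the function reduces to the standard Cosine similarity. With the frequency weights, the medicine shared by all patients contributes less, so the two toy patients who only share that medicine appear less similar. A weighting that reflects label relationships would replace `frequency_weights` with weights derived from the classification system.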

Publisher

Cold Spring Harbor Laboratory

