Development and validation of algorithms to predict left ventricular ejection fraction class from healthcare claims data


Logeart Damien1, Doublet Maxime2, Gouysse Margaux2, Damy Thibaud3, Isnard Richard4, Roubille François5


1. Department of Cardiology, Paris Cité University, AP‐HP Hôpital Lariboisière, Inserm U942, 2 rue Ambroise Paré, Paris, France

2. Clinityx, Boulogne‐Billancourt, France

3. Department of Cardiology and French National Reference Centre for Cardiac Amyloidosis, Hôpitaux Universitaires Henri‐Mondor, AP‐HP, IMRB, Inserm, Université Paris‐Est Créteil, Créteil, France

4. Hôpital Pitié‐Salpétrière, AP‐HP, Paris, France

5. Department of Cardiology, INI‐CRT, PhyMedExp, Inserm, CNRS, CHU de Montpellier, Université de Montpellier, Montpellier, France


Abstract

Aims

The use of large medical or healthcare claims databases is very useful for population‐based studies on the burden of heart failure (HF). Clinical characteristics and management of HF patients differ according to categories of left ventricular ejection fraction (LVEF), but this information is often missing in such databases. We aimed to develop and validate algorithms to identify LVEF in healthcare databases where the information is lacking.

Methods and results

Algorithms were built by machine learning with a random forest approach. Algorithms were trained and reinforced using the French national claims database [Système National des Données de Santé (SNDS)] and a French HF registry. Variables were age, gender, and comorbidities, which could be identified by medico‐administrative code‐based proxies, Anatomical Therapeutic Chemical codes for drug delivery, International Classification of Diseases (Tenth Revision) coding for hospitalizations, and administrative codes for any other type of reimbursed care. The algorithms were validated by cross‐validation and against a subset of the SNDS that includes LVEF information. The areas under the receiver operating characteristic curve were 0.84 for the algorithm identifying LVEF ≤ 40% and 0.79 for the algorithms identifying LVEF < 50% and ≥ 50%. For LVEF ≤ 40%, the reinforced algorithm identified 50% of patients in the validation dataset with a positive predictive value of 0.88 and a specificity of 0.96. The most important predictive variables were delivery of HF medication, sex, age, hospitalization, and testing for natriuretic peptides with different orders of positive or negative importance according to the LVEF category.

Conclusions

The algorithms identify reduced or preserved LVEF in HF patients within a nationwide healthcare claims database with high positive predictive value and low rates of false positives.
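The abstract reports the LVEF ≤ 40% classifier by the share of true cases it identifies (sensitivity), its positive predictive value, and its specificity. The sketch below is not the authors' code. It only illustrates how these three quantities are derived from a classifier's scores once a decision threshold is fixed. The function name `threshold_metrics`, the toy labels, the toy scores, and the threshold are all hypothetical and have no relation to the SNDS data.

```python
from typing import Dict, Sequence


def threshold_metrics(
    y_true: Sequence[int], scores: Sequence[float], threshold: float
) -> Dict[str, float]:
    """Compute sensitivity, specificity, and PPV at a given score threshold.

    y_true: 1 for the target class (e.g. LVEF <= 40%), 0 otherwise.
    scores: classifier scores, e.g. class probabilities from a random forest.
    """
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, scores):
        predicted_positive = score >= threshold
        if predicted_positive and label == 1:
            tp += 1
        elif predicted_positive and label == 0:
            fp += 1
        elif not predicted_positive and label == 0:
            tn += 1
        else:
            fn += 1
    nan = float("nan")
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
        "ppv": tp / (tp + fp) if (tp + fp) else nan,
    }


# Hypothetical toy data: 4 patients in the target class, 6 outside it.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.7, 0.2, 0.1, 0.3, 0.2, 0.1]

strict = threshold_metrics(y_true, scores, threshold=0.75)
loose = threshold_metrics(y_true, scores, threshold=0.5)
print(strict)
print(loose)
```

On this toy data, raising the threshold from 0.5 to 0.75 removes the single false positive while still flagging the same half of the true cases. A high threshold therefore favours positive predictive value and specificity, at the cost of leaving part of the target class unidentified.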


Amgen Foundation



Boehringer Ingelheim










