Author:
Balyan Renu, Crossley Scott A., Brown William, Karter Andrew J., McNamara Danielle S., Liu Jennifer Y., Lyles Courtney R., Schillinger Dean
Abstract
Limited health literacy can be a barrier to healthcare delivery, but widespread classification of patient health literacy is challenging. To develop and validate literacy profiles as indicators of patients’ health literacy, we applied natural language processing and machine learning to a large sample of 283,216 secure messages sent by 6,941 patients to their clinicians. All patients were participants in Kaiser Permanente Northern California’s DISTANCE Study. We created three literacy profiles and compared the performance of each against a gold standard of patient self-report. We also analyzed associations between the literacy profiles and patient demographics, health outcomes and healthcare utilization. T-tests were used for numeric data such as A1C, Charlson comorbidity index and healthcare utilization rates, and chi-square tests for categorical data such as sex, race, continuous medication gaps and severe hypoglycemia. Literacy profiles varied in their test characteristics, with C-statistics ranging from 0.61 to 0.74. Relationships between literacy profiles and health outcomes revealed patterns consistent with previous health literacy research: patients identified via literacy profiles as having limited health literacy were older and more likely to belong to a minority group; had poorer medication adherence and glycemic control; and had higher rates of hypoglycemia, comorbidities and healthcare utilization. This research represents the first successful attempt to use natural language processing and machine learning to measure health literacy. Literacy profiles offer an automated and economical way to identify patients with limited health literacy and a greater vulnerability to poor health outcomes.
Publisher
Cold Spring Harbor Laboratory