Examining Linguistic Differences in Electronic Health Records for Diverse Patients with Diabetes (Preprint)

Published: 2023-06-30

Author:

Bilotta Isabel, Tonidandel Scott, Liaw Winston R., King Eden, Carvajal Diana, Taylor Ayana, Thamby Julie, Xiang Yang, Tao Cui, Hansen Michael

Abstract

BACKGROUND

Individuals from minoritized racial and ethnic backgrounds suffer from pernicious and pervasive health disparities that have emerged, in part, from clinician bias.

OBJECTIVE

We used a natural language processing approach to examine whether linguistic markers in electronic health record (EHR) notes differ based on the race and ethnicity of the patient. To validate this approach, we also assessed the extent to which clinicians perceive linguistic markers to be indicative of bias.

METHODS

In this cross-sectional study, we extracted EHR notes for patients 18 years of age or older who were diagnosed with type 2 diabetes and received care from family physicians, general internists, or endocrinologists practicing in an urban, academic network of clinics between 2006 and 2015. Race and ethnicity of patients were defined as ‘White Non-Hispanic,’ ‘Black Non-Hispanic,’ or ‘Hispanic/Latino’. We hypothesized that SEANCE (Sentiment Analysis and Social Cognition Engine) components (i.e., negative adjectives, positive adjectives, joy, fear and disgust, politics, respect, trust verbs, well-being) and mean word count would be indicators of bias if racial differences emerged. We performed linear mixed effects analyses to examine the relationship between the outcomes of interest (the SEANCE components and word count) and patient race and ethnicity, controlling for patient age. To validate this approach, we asked clinicians to indicate the extent to which (on a scale of 1 to 10, with 10 being extremely indicative of bias) they thought variation in the use of SEANCE language domains for different racial and ethnic groups was reflective of bias in EHR notes.
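As a rough illustration of the kind of per-note outcome described above, the sketch below computes a word count and a lexicon-based rate for each note and averages them by group. It is a simplified stand-in, not SEANCE itself: the word list `NEGATIVE_ADJECTIVES`, the function names, the group labels "A" and "B", and the example notes are all hypothetical, and the study modeled SEANCE component scores with linear mixed effects models rather than comparing raw group means.

```python
import re
from collections import defaultdict
from statistics import mean

# Hypothetical mini-lexicon for illustration only; SEANCE relies on
# much larger dictionaries and derived component scores.
NEGATIVE_ADJECTIVES = {"noncompliant", "difficult", "poor", "unwilling"}


def tokenize(text):
    """Lowercase the note and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def note_features(text):
    """Return the word count and the share of tokens found in the lexicon."""
    tokens = tokenize(text)
    n = len(tokens)
    hits = sum(1 for t in tokens if t in NEGATIVE_ADJECTIVES)
    return {"word_count": n, "neg_adj_rate": hits / n if n else 0.0}


def group_means(notes):
    """Average each feature over notes, keyed by a (hypothetical) group label."""
    by_group = defaultdict(list)
    for group, text in notes:
        by_group[group].append(note_features(text))
    return {
        group: {
            key: mean(f[key] for f in feats)
            for key in ("word_count", "neg_adj_rate")
        }
        for group, feats in by_group.items()
    }


if __name__ == "__main__":
    # Invented example notes; not drawn from any real record.
    example_notes = [
        ("A", "Patient is difficult and noncompliant"),
        ("A", "Patient reports good glucose control"),
        ("B", "Patient is doing well on metformin"),
    ]
    for group, feats in group_means(example_notes).items():
        print(group, feats)
```

Raw group means like these ignore that notes are clustered within patients and physicians, which is one reason a mixed effects model is the more appropriate analysis.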

RESULTS

We examined EHR notes (n = 12,905) of Black Non-Hispanic, White Non-Hispanic, and Hispanic/Latino patients (n = 1,562), who were seen by 281 physicians. Twenty-seven clinicians participated in the validation study. Participants rated negative adjectives as 8.63 (SD=2.06), fear and disgust as 8.11 (SD=2.15), and positive adjectives as 7.93 (SD=2.46). Notes for Black Non-Hispanic patients contained significantly more negative adjectives (coeff=0.07, SE=0.02) and significantly more fear and disgust words (coeff=0.007, SE=0.002) compared to the notes for White Non-Hispanic patients. The notes for Hispanic/Latino patients included significantly fewer positive adjectives (coeff=-0.02, SE=0.007), trust verbs (coeff=-0.009, SE=0.004), and joy words (coeff=-0.03, SE=0.01) compared to the notes for White Non-Hispanic patients.
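The reported coefficients and standard errors can be related to the word "significantly" with a quick check. The sketch below divides each coefficient by its standard error and converts the ratio to a two-sided p value under a normal approximation. This is an assumption made for illustration: the abstract does not state which test statistic or degrees of freedom were used, so the values are approximate and not the authors' reported results. The function names are hypothetical.

```python
import math


def wald_z(coeff, se):
    """Ratio of an estimate to its standard error."""
    return coeff / se


def two_sided_p(z):
    """Two-sided p value for z under a standard normal approximation."""
    return math.erfc(abs(z) / math.sqrt(2))


# Coefficients and standard errors as given in the Results paragraph,
# each relative to notes for White Non-Hispanic patients.
REPORTED = {
    "negative adjectives (Black Non-Hispanic)": (0.07, 0.02),
    "fear and disgust (Black Non-Hispanic)": (0.007, 0.002),
    "positive adjectives (Hispanic/Latino)": (-0.02, 0.007),
    "trust verbs (Hispanic/Latino)": (-0.009, 0.004),
    "joy (Hispanic/Latino)": (-0.03, 0.01),
}

if __name__ == "__main__":
    for name, (coeff, se) in REPORTED.items():
        z = wald_z(coeff, se)
        print(f"{name}: z = {z:.2f}, approx. p = {two_sided_p(z):.4f}")
```

Under this approximation, every ratio exceeds 1.96 in absolute value, which is consistent with the differences being described as significant at the conventional 0.05 level.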

CONCLUSIONS

If validated, this approach may enable physicians and researchers to identify and mitigate bias in medical interactions, with the goal of reducing health disparities stemming from bias.

Publisher

JMIR Publications Inc.
