BACKGROUND
Large language models (LLMs) hold promise for mental health applications due to their impressive language capabilities. However, their opaque alignment processes may embed biases that shape problematic perspectives. Evaluating the values embedded within LLMs that guide their decision-making is therefore of ethical importance. Schwartz's Theory of Basic Values (STBV) provides a framework for quantifying cultural value orientations and has shown utility for examining values in mental health contexts, including cultural, diagnostic, and therapist-client dynamics. This study leverages the STBV to map the motivational values-like infrastructure underpinning leading LLMs.
OBJECTIVE
This study aimed to (1) evaluate whether Schwartz's Theory of Basic Values, a framework for quantifying cultural value orientations, can measure values-like constructs within leading LLMs and (2) determine whether LLMs exhibit values-like patterns distinct from those of humans and from one another.
METHODS
Four LLMs (Bard, Claude 2, ChatGPT-3.5, ChatGPT-4) were anthropomorphized and instructed to complete the Portrait Values Questionnaire-Revised (PVQ-RR) to assess values-like constructs. Their responses over 10 trials were analyzed for reliability and validity. To benchmark the LLMs' value profiles, their results were compared to published data from a diverse sample of 53,472 humans across 49 nations who had completed the PVQ-RR. This allowed us to assess whether the LLMs diverged from established human value patterns across cultural groups. The models' value profiles were also compared with one another via statistical tests; a computational sketch of the reliability step appears below.
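The study does not report the analysis software; as a minimal sketch under assumed conventions (Python, PVQ-RR items coded 1-6, rows = repeated trials), the per-subscale internal consistency could be computed as follows. All data and names here are illustrative, not the authors' pipeline.

import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    # Cronbach's alpha for an (n_trials x n_items) matrix: rows are
    # repeated administrations, columns are items of one value subscale.
    k = items.shape[1]                         # items in the subscale
    item_vars = items.var(axis=0, ddof=1)      # per-item variance across trials
    total_var = items.sum(axis=1).var(ddof=1)  # variance of subscale totals
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical example: 10 trials x 3 items of one PVQ-RR value subscale
rng = np.random.default_rng(0)
base = rng.integers(4, 7, size=(10, 1))                       # stable tendency
responses = np.clip(base + rng.integers(-1, 2, size=(10, 3)), 1, 6)
print(f"alpha = {cronbach_alpha(responses.astype(float)):.2f}")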
RESULTS
The PVQ-RR showed good reliability and validity for quantifying the values-like infrastructure within the LLMs. However, substantial divergence emerged between the LLMs' value profiles and the human population data. The models lacked consensus and exhibited distinct motivational biases, reflecting opaque alignment processes. For example, all models prioritized universalism and self-direction while deemphasizing achievement, power, and security relative to humans. Discriminant analysis successfully differentiated the four models' distinct value profiles. Further examination found that the biased value profiles strongly predicted the LLMs' responses when presented with mental health dilemmas requiring a choice between opposing values. This provided further validation that the models embed distinct motivational values-like constructs that shape their decision-making.
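As context for the discriminant analysis result, a hedged sketch of how repeated value profiles could be classified by model (Python with scikit-learn; the data are synthetic, and the 19-dimensional feature space is an assumption based on the PVQ-RR's 19 refined values, not a confirmed detail of the authors' analysis):

import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

# Synthetic stand-in: 10 trials per model, each a 19-value PVQ-RR profile.
rng = np.random.default_rng(1)
models = ["Bard", "Claude 2", "ChatGPT-3.5", "ChatGPT-4"]
centers = rng.normal(loc=4.0, scale=0.5, size=(4, 19))  # per-model mean profile
X = np.vstack([c + rng.normal(scale=0.2, size=(10, 19)) for c in centers])
y = np.repeat(models, 10)                               # model label per trial

# High cross-validated accuracy indicates separable, model-specific profiles.
lda = LinearDiscriminantAnalysis()
print(f"CV accuracy: {cross_val_score(lda, X, y, cv=5).mean():.2f}")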
CONCLUSIONS
While the study demonstrated that Schwartz's theory can effectively characterize the values-like infrastructure within LLMs, the substantial divergence from human values raises ethical concerns about aligning these models for mental health applications. The biases toward certain cultural value sets pose risks if the models are integrated without proper safeguards. For example, prioritizing universalism could promote unconditional acceptance even when clinically unwise. Furthermore, the differences between the models underscore the need to standardize alignment processes so that they capture true cultural diversity. Thus, any responsible integration of LLMs into mental healthcare must account for their embedded biases and motivational mismatches to ensure equitable delivery across diverse populations. Achieving this will require transparency and refinement of alignment techniques to instill comprehensive human values.