Abstract
Whether artificial intelligence (AI) can grasp the subtleties and complexities of human emotions has been a long-standing debate in AI research. Recent advancements, particularly in large language models (LLMs), have begun to challenge the assertion that it cannot by demonstrating an increased capacity for understanding and generating human-like text, a significant step toward artificial empathy and emotional intelligence. In this study, we evaluated the empathy levels and the identification and description of emotions of three current language models: Bard, GPT 3.5, and GPT 4. We used the items of the Toronto Alexithymia Scale (TAS-20) and the 60-question Empathy Quotient (EQ-60) to prompt these models and scored their responses. The models' performance was contrasted with human benchmarks from neurotypical controls and clinical populations. We found that the less sophisticated models (Bard and GPT 3.5) performed poorly on the TAS-20, with scores close to those seen in alexithymia, a condition marked by significant difficulties in recognizing, expressing, and describing one's own or others' emotions. In contrast, the newest model, GPT 4, was the only one to achieve performance close to the human level, surpassing humans in two sub-categories. Interestingly, there was an inverse relationship between the models' success on aptitude tests and their performance on the EQ-60: Bard significantly surpassed the human benchmark, whereas GPT 3.5 and GPT 4 did not. These results demonstrate that LLMs trained on vast amounts of text data, when benchmarked on their capacity for human-level empathy and emotional intelligence, are comparable to humans in their ability to identify and describe emotions and may be able to surpass humans in their capacity for emotional intelligence. These novel insights into the emotional intelligence capabilities of foundational models inform alignment research and provide a measure of the progress and limitations in aligning such models with human values.
While the journey towards fully empathetic AI is still ongoing, these advancements suggest that it may not be as far-fetched as once believed.
Publisher
Cold Spring Harbor Laboratory
Cited by
3 articles.