Affiliation:
1. International Research Center for Neurointelligence (WPI‐IRCN), UTIAS, The University of Tokyo, Tokyo, Japan
Abstract
Creativity is defined by three key factors: novelty, feasibility and value. While many creativity tests focus primarily on novelty, they often neglect feasibility and value, thereby limiting their reflection of real‐world creativity. In this study, we employ GPT‐4, a large language model, to assess these three dimensions in a Japanese‐language Alternative Uses Test (AUT). Using a crowdsourced evaluation method, we acquire ground truth data for 30 question items and test various GPT prompt designs. Our findings show that asking for multiple responses in a single prompt, using an ‘explain first, rate later’ design, is both cost‐effective and accurate (r = .62, .59 and .33 for novelty, feasibility and value, respectively). Moreover, our method offers comparable accuracy to existing methods in assessing novelty, without the need for training data. We also evaluate additional models such as GPT‐4 Turbo, GPT‐4 Omni and Claude 3.5 Sonnet. Comparable performance across these models demonstrates the universal applicability of our prompt design. Our results contribute a straightforward platform for instant AUT evaluation and provide valuable ground truth data for future methodological research.
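As an illustration only, the batched ‘explain first, rate later’ design and the correlation check against crowdsourced ratings might be sketched as follows. This is not the authors' actual prompt or code: the prompt wording, the 1 to 5 scale, the JSON reply format and the function names `build_prompt`, `parse_ratings` and `pearson_r` are all assumptions made for this sketch, and the call to the language model itself is deliberately left out.

```python
import json
import math

# The three dimensions named in the abstract.
DIMENSIONS = ("novelty", "feasibility", "value")


def build_prompt(item, responses):
    """Batch several AUT responses into one 'explain first, rate later' prompt.

    Hypothetical wording: the 1-5 scale and JSON format are assumptions.
    """
    numbered = "\n".join(f"{i}. {r}" for i, r in enumerate(responses, start=1))
    return (
        f"Object: {item}\n"
        f"Proposed alternative uses:\n{numbered}\n\n"
        "For each use, first write a one-sentence explanation, then rate its "
        "novelty, feasibility and value from 1 to 5.\n"
        'Reply with a JSON list of objects with the keys "id", "explanation", '
        '"novelty", "feasibility" and "value", in that order.'
    )


def parse_ratings(reply):
    """Convert a JSON reply into {dimension: [scores ordered by response id]}."""
    rows = sorted(json.loads(reply), key=lambda row: row["id"])
    return {dim: [float(row[dim]) for row in rows] for dim in DIMENSIONS}


def pearson_r(xs, ys):
    """Pearson correlation between model scores and human ground truth."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    prompt = build_prompt("brick", ["doorstop", "paperweight", "phone stand"])
    print(prompt)

    # Mock reply standing in for a model response (fabricated for illustration).
    mock_reply = json.dumps([
        {"id": 1, "explanation": "Heavy enough to hold a door.",
         "novelty": 1, "feasibility": 5, "value": 3},
        {"id": 2, "explanation": "Weight keeps paper in place.",
         "novelty": 2, "feasibility": 5, "value": 3},
        {"id": 3, "explanation": "A groove could prop up a phone.",
         "novelty": 4, "feasibility": 3, "value": 2},
    ])
    model_scores = parse_ratings(mock_reply)

    # Illustrative human ratings, not data from the study.
    human_novelty = [1.2, 1.8, 3.9]
    print(pearson_r(model_scores["novelty"], human_novelty))
```

Asking for the explanation before the numeric fields means the model produces its reasoning before committing to scores, and batching several responses into one prompt reduces the number of requests per question item.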
Funder
Ministry of Education, Culture, Sports, Science and Technology