We study whether a large language model can reliably evaluate human creativity in constrained, innovation-like tasks. Using expert-generated creative outputs from a validated experiment with workers in cultural and creative industries, we embed ChatGPT as an evaluator and benchmark its assessments against expert human judgments obtained through the Consensual Assessment Technique. Study 1 supports AI reliability by showing that AI-based creativity evaluations exhibit internal consistency comparable to that of expert judges across repeated and independent runs, even under conservative scenarios. Replacing a human judge with an AI evaluator does not reduce inter-rater reliability across drawing, mathematical, and verbal tasks. Beyond reliability, AI evaluations display three additional features that are difficult to achieve with human-only panels: lower evaluative variability, systematically higher scores consistent with a potentially more inclusive evaluative stance, and task-independence of evaluative standards. Study 2 further supports task-independence by showing that AI evaluations are structured along fluency, flexibility, originality, and elaboration, with dimension weights that adapt to task-specific constraints.
• We test AI evaluation of human creativity on outputs from a controlled experiment.
• We study constrained, innovation-like creative tasks.
• Replacing one human judge with AI preserves panel reliability.
• AI scores are less dispersed, higher on average, and task-independent.
• AI evaluation is structured by fluency, flexibility, originality, and elaboration.
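The panel-reliability comparison described above can be illustrated with a minimal sketch. The abstract does not state which reliability statistic the authors use; Cronbach's alpha is assumed here because it is commonly reported for Consensual Assessment Technique panels. The function names `cronbach_alpha` and `replace_judge` and all scores are hypothetical and serve only as illustration; they are not the paper's data or procedure.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Cronbach's alpha for a panel of judges.

    ratings: list of judges, each a list of scores for the same outputs,
    in the same order. Assumed statistic; the paper may use another one.
    """
    k = len(ratings)
    if k < 2:
        raise ValueError("need at least two judges")
    n = len(ratings[0])
    if any(len(r) != n for r in ratings):
        raise ValueError("every judge must score the same outputs")
    # Sample variance of each judge's scores across outputs.
    judge_vars = [variance(r) for r in ratings]
    # Sample variance of the summed panel score per output.
    totals = [sum(r[i] for r in ratings) for i in range(n)]
    total_var = variance(totals)
    if total_var == 0:
        raise ValueError("panel totals have zero variance")
    return k / (k - 1) * (1 - sum(judge_vars) / total_var)


def replace_judge(ratings, index, new_scores):
    """Return a copy of the panel with one judge's scores swapped out."""
    panel = [list(r) for r in ratings]
    panel[index] = list(new_scores)
    return panel


# Made-up scores for five creative outputs (illustration only).
human_panel = [
    [1, 2, 3, 4, 5],
    [2, 3, 4, 5, 6],
    [1, 3, 3, 5, 5],
]
# Hypothetical AI scores: higher on average, same ordering of outputs.
ai_scores = [3, 4, 5, 6, 7]

alpha_human = cronbach_alpha(human_panel)
alpha_mixed = cronbach_alpha(replace_judge(human_panel, 2, ai_scores))
print(f"human-only panel alpha: {alpha_human:.3f}")
print(f"panel with one AI judge: {alpha_mixed:.3f}")
```

Because alpha depends on how consistently judges order the outputs and not on their mean level, a judge who scores systematically higher can still leave panel reliability intact, which is in line with the pattern the abstract reports.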
Valerio Fedele Addis
Giuseppe Attanasi
Giovanni Di Bartolomeo
Technovation
Sapienza University of Rome
Corvinus University of Budapest
Addis et al. studied this question.
www.synapsesocial.com/papers/6a080ab3a487c87a6a40ca2d — DOI: https://doi.org/10.1016/j.technovation.2026.103571