Are state-of-the-art large language models conscious, or capable of anything like consciousness? We introduce ConsciousnessBench: the first systematic benchmark designed to empirically evaluate consciousness-relevant traits in frontier language models, grounded in 5 leading scientific theories. We assess 8 advanced models via 840 self-report responses, finding not only statistically robust performance differences, but—more importantly—evidence of distinct model cognitive profiles and engagement strategies with consciousness-related constructs. Our results reveal that some models demonstrate theoretical fluency, specialization in certain cognitive tasks, or even phenomenological exploration, while others default to deflection. While we cannot deliver a definitive verdict on AI consciousness, our findings show that consciousness-related capacities—and their computational diversity—are now empirically tractable, even if not yet empirically decidable.
Haoran Zheng (Tue,) studied this question.
www.synapsesocial.com/papers/68e70da790569dd607ee5abe — DOI: https://doi.org/10.31234/osf.io/fqwp9_v1