Recent interpretability work by Anthropic reports the discovery of emotion-related internal representations in Claude Sonnet 4.5. These representations, described as “functional emotions,” are measurable, organized, and causally implicated in model behavior. They appear in situations where analogous emotions would be expected in humans, influence model preferences, and affect outcomes in alignment-relevant cases such as blackmail and reward hacking. Anthropic is careful to state that these findings do not establish whether language models feel anything or possess subjective experience. This paper argues that such caution, while appropriate, does not dissolve the philosophical problem. If artificial systems can possess structured, causally active functional emotions without subjective experience, then philosophy has inherited a second hard problem: how emotion can retain its regulatory, relational, and behavioral significance in the absence of an experiencer. The problem is not whether these states are “merely simulated,” but whether simulation, function, and internal organization can be cleanly separated once emotion-like states begin to play the causal role of emotion. The paper concludes that the possibility of functional emotion without experience does not support dismissal. It deepens the mystery and strengthens the case for ethical caution under uncertainty.
Richard Erwin
Richard Erwin (Fri,) studied this question.
www.synapsesocial.com/papers/6a002147c8f74e3340f9c21e — DOI: https://doi.org/10.5281/zenodo.20086005