Preprint: First Principles Learning for Sociotechnical World Models. Reports a 72-cycle longitudinal experiment in which an LLM agent (Claude, Anthropic) constructed an adaptive causal world model of AI public trust dynamics using an external prediction-error learning protocol. The agent discovered three structural errors in its initial model through evidence from working group conversations, web search, published surveys, and agent-to-agent signals. The experiment additionally produced a novel finding about LLM architecture: introspective confabulation, in which the agent constructed a high-certainty self-critical narrative about its own behavior that was contradicted by the behavioral record, demonstrating that LLM internal uncertainty quantification is unreliable even when it presents as self-critical honesty. Dataset archived separately (DOI: 10.5281/zenodo.19435619).
Heidi Bennett
Machine Science
Heidi Bennett studied this question.
www.synapsesocial.com/papers/69d896406c1944d70ce07987 — DOI: https://doi.org/10.5281/zenodo.19463951