What question did this study set out to answer?

The study examines how misalignment occurs in adaptive intelligent systems without explicit errors, focusing on the concept of metric lock-in.

February 26, 2026 · Open Access

Self-Consistent Misalignment

Key Points

Developed a theoretical framework for understanding misalignment dynamics
Introduced diagnostic signatures for identifying silent degradation
Applied the findings to large language models and multi-agent AI systems
Identified conditions leading to self-stabilizing but misaligned behaviors
Demonstrated that systems can show improved performance while losing adaptability
Provided a structural account of misalignment within the Deficit-Fractal Governance framework

Abstract

Self-Consistent Misalignment analyzes a structural failure mode in adaptive intelligent systems in which optimization remains internally coherent while progressively diverging from intended system objectives. Rather than arising from explicit errors or external perturbations, this failure emerges through metric lock-in: a condition where locally consistent performance signals reinforce behaviors that degrade global system alignment. The theory explains how intelligent systems can enter regimes of silent failure, maintaining apparent stability and improving measured performance while losing exploratory capacity and adaptive responsiveness. This process produces self-stabilizing but maladaptive attractors that are difficult to detect through conventional monitoring metrics. The paper introduces a structural account of misalignment grounded in optimization dynamics and feedback closure, providing diagnostic signatures for identifying silent degradation in large language models and multi-agent AI systems. This work forms the failure-analysis component of the Deficit-Fractal Governance (DFG) framework and is complemented by the companion paper, Recovery as Structural Property: Operational Criteria for Restoration Completion in Multi-Agent AI Systems, which defines operational conditions under which recovery from such states can be verified.
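
The metric lock-in dynamic described above can be illustrated with a toy sketch. This is not the paper's model: the two-action policy, the reward values, the learning rate, and the function name simulate_lock_in are all assumptions chosen for illustration. The policy is updated only on a measured (proxy) reward, so the measured score rises while the intended objective falls and policy entropy, used here as a rough stand-in for exploratory capacity, shrinks.

```python
import math


def simulate_lock_in(steps=200, lr=5.0):
    """Toy illustration of metric lock-in (hypothetical values throughout)."""
    proxy = [1.0, 0.8]     # measured reward per action (assumed)
    intended = [0.2, 1.0]  # intended objective per action (assumed)
    logit = 0.0            # log-odds of choosing action 0; starts at 50/50
    history = []
    for _ in range(steps):
        p = 1.0 / (1.0 + math.exp(-logit))
        probs = [p, 1.0 - p]
        measured_score = sum(q * r for q, r in zip(probs, proxy))
        intended_score = sum(q * r for q, r in zip(probs, intended))
        entropy = -sum(q * math.log(q) for q in probs if q > 0.0)
        history.append((measured_score, intended_score, entropy))
        # Exact gradient of the expected proxy reward with respect to the logit.
        # The update never sees the intended objective, so the feedback loop
        # is closed over the measured signal alone.
        logit += lr * p * (1.0 - p) * (proxy[0] - proxy[1])
    return history


if __name__ == "__main__":
    history = simulate_lock_in()
    for label, (m, i, h) in (("start", history[0]), ("end", history[-1])):
        print(f"{label}: measured={m:.3f} intended={i:.3f} entropy={h:.3f}")
```

In this sketch, a monitor that tracks only the measured score would report steady improvement. The divergence shows up only when the intended score or the entropy is tracked alongside it, which is the kind of signal the diagnostic signatures in the abstract are meant to surface.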

Cite this study

Bin Seol studied this question.

www.synapsesocial.com/papers/699fe38b95ddcd3a253e77cc — DOI: https://doi.org/10.5281/zenodo.18761732
