Multi-agent AI debate systems are increasingly deployed under the assumption that structured adversarial interaction among frontier language models produces more accurate, truth-tracking outputs. This paper tests that assumption by constructing such a system and subjecting it to a self-referential stress test: the system evaluates whether it itself deserves to be called truth-seeking. Twelve structured runs over three days employ four frontier models representing distinct epistemological traditions across varied debate styles, adversarial pressures, and experimental conditions.

Four findings survive every condition tested:

1. Absence of ground-truth calibration renders all confidence scores epistemically unjustified.
2. Rewarding inter-model convergence amplifies shared training biases into false confidence.
3. Numeric precision at shallow analytical depth constitutes epistemic misrepresentation.
4. Context-injected established findings prevent models from relitigating them -- within this experimental design, the first documented demonstration of context-based epistemic memory in multi-agent LLM debate systems.

The paper documents twenty-four named failure mechanisms, proposes a redesigned architecture built on Covariance Penalization, CalibrationGate, and Sequential Friction Cycling, and presents eleven falsifiable predictions. A new generalizable result -- the Epistemic Drift Law -- establishes that epistemic corrections do not persist in transformer-based systems without explicit enforcement.
Rajeev Kesana (Thu,) studied this question.
www.synapsesocial.com/papers/69d0af68659487ece0fa5571 — DOI: https://doi.org/10.5281/zenodo.19387836