Background: Traditional philosophy of science assumes that scientific conclusions are logically derived from argumentation. If that is true, substantial changes in argumentative structure should lead to corresponding changes in conclusions.

The Paradox: Peer review often demands drastic restructuring of the argumentation, yet conclusions remain remarkably stable, a phenomenon we term the "Paradox of Revision."

Theoretical Prediction: If conclusions originate from pre-linguistic insights (R) rather than from post-hoc argumentation (M), then exogenous variation in M (induced by peer review) should not affect conclusion stability (Y); formally, M ↛ Y.

Methods: We analyzed 500 neuroscience articles from eLife Reviewed Preprints, randomly sampled from a pre-processed pool of N = 2,966 articles across 18 disciplines. Using AI-based coding (DeepSeek-chat, temperature = 0), we classified (1) whether peer review targeted argumentation (Z); (2) the magnitude of argumentation change (M: minor/moderate/substantial); and (3) conclusion stability (Y: identical/paraphrased/qualified/changed).

Results: Among the 480 articles with argumentation-targeting reviews (96% of the sample), conclusion stability was near-universal (99.8%; 479/480 stable). Critically, among the 12 cases with substantial argumentation restructuring (>30% change), none exhibited changed conclusions (0/12 = 0%). The single changed case involved a genuine scope expansion, which supports the sensitivity of the coding.

Conclusion: The near-complete absence of M→Y effects, stronger than theoretically anticipated, supports the claim that scientific argumentation serves to *anchor* pre-existing insights rather than *generate* conclusions. This has implications for understanding scientific justification, the efficacy of peer review, and the ontology of discovery versus justification.

Next Steps: We are scaling this protocol to multiple disciplines (pre-processed pool: 2,966 articles) to enable instrumental variable regression (IV-2SLS) for causal identification.
This brief report establishes priority for the research design and pilot findings.
Wenpeng Wei (Tue,) studied this question.
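
The proportions reported in the Results can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not the authors' analysis code: it recomputes the reported percentages from the counts in the abstract and, as an added assumption, attaches a one-sided 95% exact (Clopper-Pearson style) upper bound to each "changed conclusion" rate. The function names `binom_cdf` and `exact_upper_bound` are hypothetical helpers introduced here.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def exact_upper_bound(k: int, n: int, alpha: float = 0.05) -> float:
    """One-sided exact upper bound on a binomial proportion.

    Returns the largest p for which observing k or fewer events in n
    trials still has probability at least alpha (found by bisection).
    This interval choice is an assumption, not taken from the source.
    """
    if k >= n:
        return 1.0
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if binom_cdf(k, n, mid) > alpha:
            lo = mid  # k events still plausible at this rate; try a higher one
        else:
            hi = mid
    return (lo + hi) / 2


# Counts taken from the abstract.
sampled = 500          # neuroscience articles analyzed
targeted = 480         # reviews that targeted argumentation (Z)
stable = 479           # stable conclusions among the targeted articles
substantial = 12       # cases with substantial restructuring (>30% change)
changed_substantial = 0

share_targeted = targeted / sampled            # 0.96
share_stable = stable / targeted               # about 0.998
upper_all = exact_upper_bound(targeted - stable, targeted)
upper_substantial = exact_upper_bound(changed_substantial, substantial)

print(f"targeted: {share_targeted:.1%}, stable: {share_stable:.1%}")
print(f"upper bound on change rate, all targeted: {upper_all:.2%}")
print(f"upper bound on change rate, substantial: {upper_substantial:.1%}")
```

With zero events the bound has the closed form 1 - 0.05^(1/12), roughly 22%, so under this assumption the 0/12 result alone is compatible with a sizeable change rate in the substantial-restructuring subgroup; the tighter bound comes from the full set of 480 articles.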