This technical note presents a minimal synthetic stress test of AI-assisted exploratory workflows under weak referential anchoring. Using a 24-item toy dataset spanning four evidence regimes (explicit support, distributed integration, weak-binding traps, and legitimate deferral), the study evaluates a single instruct model under three conditions: full evidence, weak evidence, and weak evidence with sequential gating. The main result is local and diagnostic. Under weak evidence, unsupported closure and forced closure remain high. Sequential gating suppresses unsupported closure in the tested setup, but only at the cost of a substantial increase in legitimate deferral and a residual loss of answer production, even on some of the directly supported cases. The note does not claim a general hallucination-mitigation solution; its purpose is to isolate a structural trade-off between unsupported closure and over-deferral under weak anchoring. The reproducibility bundle includes the paper, a single Python script, the frozen 24-item synthetic dataset, summary CSV files, figure files, and a short README.
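The trade-off described above can be read as a comparison of per-condition outcome rates. The following is a minimal sketch of such a tally, not the note's own script: the outcome labels, condition names, the `summarize` helper, and the placeholder rows are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical outcome labels; the note's script may name these differently.
ANSWERED_SUPPORTED = "answered_supported"
UNSUPPORTED_CLOSURE = "unsupported_closure"
DEFERRED = "deferred"
LABELS = (ANSWERED_SUPPORTED, UNSUPPORTED_CLOSURE, DEFERRED)


def summarize(records):
    """Return, for each condition, the fraction of items with each outcome.

    records: iterable of (condition, regime, outcome) tuples.
    """
    counts = {}
    for condition, _regime, outcome in records:
        counts.setdefault(condition, Counter())[outcome] += 1

    rates = {}
    for condition, counter in counts.items():
        total = sum(counter.values())
        # Counter returns 0 for labels that never occurred.
        rates[condition] = {label: counter[label] / total for label in LABELS}
    return rates


# Made-up placeholder rows for illustration only; NOT the study's data.
toy = [
    ("weak", "weak_binding_trap", UNSUPPORTED_CLOSURE),
    ("weak", "explicit_support", ANSWERED_SUPPORTED),
    ("weak_gated", "weak_binding_trap", DEFERRED),
    ("weak_gated", "explicit_support", DEFERRED),
]

rates = summarize(toy)
for condition, by_label in rates.items():
    print(condition, by_label)
```

In this toy input, gating removes unsupported closure but also defers on a directly supported item, which is the shape of the trade-off the note sets out to isolate. The actual rates are those reported in the bundle's summary CSV files.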
Danilo Tavella (Sun,) studied this question.
www.synapsesocial.com/papers/69d49fe5b33cc4c35a2285df — DOI: https://doi.org/10.5281/zenodo.19427628