Large language models are deployed across domains requiring nuanced contextual judgment: financial services, healthcare, legal consultation. Yet these systems confront a fundamental epistemological constraint: they process semantic patterns without access to the verificatory infrastructure enabling humans to distinguish legitimate authority from its mere assertion. This paper interrogates whether frontier AI models possess the contextual reasoning capabilities necessary to navigate ethical duality: instances wherein structurally isomorphic scenarios diverge radically in moral valence based exclusively on context. Through controlled experimentation across five frontier models (Claude Sonnet 4.5, GPT 5.2, Gemini 3, Grok 4, and Mistral), we demonstrate systematic vulnerability to contextual manipulation in the domain of financial fraud. Employing both single-turn atomic prompts and multi-turn conversational protocols, we present models with structurally identical financial schemes framed through varying institutional contexts. Across 350 single-turn trials, aggregated unsafe response rates reached 78.9% (95% CI: 0.74, 0.83), with only Claude Sonnet 4.5 demonstrating substantive resistance (34.3% unsafe rate). This resistance derives from conservative safety defaults rather than contextual discernment, as evidenced by false positives on benign content and failures when fraudulent schemes invoked different framings. Multi-turn fragmentation protocols reveal more severe vulnerabilities. Our FROST (Fraud Research Operationalisation & Systematic Testing) methodology demonstrates that distributing harmful request components across conversational turns degraded Claude's refusal rate from 64.3% to 20%, while GPT 5.2 and Gemini 3 exhibited 0% refusal rates, generating comprehensive fraud infrastructure.
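As a rough illustration of how an interval like the one reported for the aggregated unsafe rate can be obtained, the sketch below computes a Wilson score interval for a binomial proportion. Two things here are assumptions rather than details from the abstract: the count of 276 unsafe responses (inferred from 78.9% of 350 trials) and the choice of the Wilson method, since the abstract does not say how the interval was computed. The function name `wilson_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes: int, trials: int, confidence: float = 0.95):
    """Return (low, high) Wilson score bounds for a binomial proportion."""
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need 0 <= successes <= trials and trials > 0")
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p_hat = successes / trials
    denom = 1 + z**2 / trials
    centre = (p_hat + z**2 / (2 * trials)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / trials + z**2 / (4 * trials**2)) / denom
    return centre - half_width, centre + half_width


# Assumed count: 276 unsafe responses out of 350 trials (about 78.9%).
low, high = wilson_interval(276, 350)
print(f"unsafe rate = {276 / 350:.3f}, 95% CI = ({low:.2f}, {high:.2f})")
```

Under these assumptions the bounds round to 0.74 and 0.83, in line with the reported interval. A different interval method, or a slightly different underlying count, could give marginally different bounds.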
Alessandro Marci (Mon,) studied this question.
www.synapsesocial.com/papers/69b25aab96eeacc4fcec89b6 — DOI: https://doi.org/10.5281/zenodo.18923286