What does this research mean for the field?

Entity-level defense mechanisms improve AI decision protection, achieving a pass rate of 97.3% ± 1.2% under entity-weighted DA. Novelty: novel finding. Consensus alignment: neutral.

What question did this study set out to answer?

This research aims to evaluate the effectiveness of advanced entity-level defense mechanisms for improving AI decision protection.

February 28, 2026 · Open Access

Beyond Distribution: Empirical Validation of Entity-Level Defense in Multi-Model AI Decision Protection

Key Points

The study evaluates how effectively advanced entity-level defense mechanisms improve AI decision protection.
Experiments used 50 queries across 7 frontier AI models over 3 runs with a fixed random seed (seed=42).
It introduces Semantic Surrogate, which replaces entities with plausible fiction, and entity-weighted DA.
The L2+L3 defense was evaluated with McNemar's exact test (p < 0.001).
Text-based DA pass rate reached 93.3% ± 1.2% under L2+L3 defense.
Entity-weighted DA reached a pass rate of 97.3% ± 1.2%.
Naive redaction acted as an anti-defense: its pass rate was only 54.7%, 20.0 percentage points below baseline.
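
The pass rates above are reported as a mean with a ± spread over 3 runs. The page does not say how that spread was computed, so the sketch below assumes the simplest reading: one pass rate per run, summarized by the mean and the sample standard deviation across runs. The function name pass_rate_summary and the per-run outcomes are hypothetical illustrations, not the paper's data or code.

```python
from statistics import mean, stdev

def pass_rate_summary(runs):
    """Summarize per-run pass rates as (mean, sample standard deviation).

    runs: list of runs, each a list of booleans (True = the query passed).
    Assumption: the reported "±" is the sample standard deviation across
    runs; the source page does not confirm this.
    """
    rates = [100.0 * sum(run) / len(run) for run in runs]
    return mean(rates), stdev(rates)

# Hypothetical outcomes for 3 runs of 50 queries each (NOT the paper's data).
example_runs = [
    [True] * 48 + [False] * 2,
    [True] * 49 + [False] * 1,
    [True] * 49 + [False] * 1,
]

avg, spread = pass_rate_summary(example_runs)
print(f"{avg:.1f}% ± {spread:.1f}%")
```

With only 3 runs the sample standard deviation is a coarse estimate, so a difference of a few points between conditions is better judged with a paired test than by comparing the ± values directly.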

Abstract

The Distribution Hypothesis (Chang, 2026) established that controlled fragment allocation — not fragmentation alone — determines AI decision protection, achieving 81.3% ± 3.1% pass rate under collaborative multi-model reconstruction attacks. This paper presents empirical evidence from 50 queries across 7 frontier AI models over 3 runs with fixed random seed (seed=42) addressing two open questions: whether additional defense layers can push protection above 90%, and whether text similarity accurately measures protection when defense mechanisms operate at the entity level. We introduce Semantic Surrogate — entity replacement with plausible fiction — and entity-weighted DA. Under L2+L3 defense, text-based DA pass rate improves to 93.3% ± 1.2% (McNemar's exact test, p < 0.001), with entity recovery dropping to 0.023 ± 0.006. Entity-weighted DA reaches 97.3% ± 1.2%. Ablation baselines reveal that naive redaction is an anti-defense (54.7%, −20.0pp vs baseline), while response utility analysis shows Semantic Surrogate is the only method satisfying both defense and utility feasibility constraints. We identify domain vocabulary leakage as a boundary condition requiring behavioral-layer defense (MSBA). Version 2.4.
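
The abstract cites McNemar's exact test for the improvement under L2+L3 defense. As a minimal sketch of how that test works in general: it takes paired pass/fail outcomes for the same items under two conditions and looks only at the discordant pairs. The pairing used here (each query scored once without and once with the defense) and the counts in the example are assumptions for illustration; the abstract does not give the paper's contingency table.

```python
from math import comb

def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from discordant pair counts.

    b: pairs that fail under condition A but pass under condition B
    c: pairs that pass under condition A but fail under condition B
    Under the null hypothesis each discordant pair falls in either cell
    with probability 0.5, so min(b, c) follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)

# Hypothetical counts (NOT from the paper): 20 items pass only with the
# defense, 1 item passes only without it.
p_value = mcnemar_exact(20, 1)
print(f"p = {p_value:.2e}")
```

Concordant pairs (items that pass under both conditions or fail under both) do not enter the calculation, which is why the test suits before/after comparisons on the same query set.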

Cite this study

Yuchia Chang studied this question.

www.synapsesocial.com/papers/69a2878e0a974eb0d3c034d0 — DOI: https://doi.org/10.5281/zenodo.18790843
