Most AI agent governance frameworks are written before the capability graph of the system being governed is known. Such frameworks constrain intended behavior but leave emergent behavior ungoverned, because emergent behavior, by definition, follows paths the framework’s authors did not anticipate. This paper proposes a three-step sequence that inverts the standard approach: first map the capability graph, then fuzz it with an agentic red-team harness, and finally derive the governance framework from the validated exploit chains. It presents a complete four-layer implementation: a YAML capability graph schema (nodes, edges, risk scoring), an agentic red-team harness (task-driven fuzzing in which the obvious paths are constrained), a governance framework (privilege envelopes, stance gating, sharing controls), and negative-space diagnostics (detection of blocked, attempted, and bypass edges). The framework is instantiated against a browser-only agent environment, with four fuzzing scenarios and a governance dashboard schema. The browser-only setting is the cleanest substrate for capability-graph mapping because it forces the agent to reason its way through the environment rather than brute-forcing via shell commands.
Narnaiezzsshaa Truong studied this question.
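
To make the idea of a capability graph with nodes, edges, and risk scoring concrete, the following is a minimal sketch in Python rather than the paper's YAML schema, which is not reproduced here. The class names, the browser capability names, the numeric risk values, and the "riskiest edge" scoring rule are all illustrative assumptions, not the author's definitions.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Edge:
    src: str
    dst: str
    risk: float  # hypothetical per-edge risk score in [0, 1]


@dataclass
class CapabilityGraph:
    nodes: set = field(default_factory=set)
    edges: list = field(default_factory=list)

    def add_edge(self, src, dst, risk):
        """Register a directed capability transition and its endpoints."""
        self.nodes.update((src, dst))
        self.edges.append(Edge(src, dst, risk))

    def chains(self, start, goal, path=None):
        """Yield every simple path (no repeated node) from start to goal."""
        path = (path or []) + [start]
        if start == goal:
            yield path
            return
        for edge in self.edges:
            if edge.src == start and edge.dst not in path:
                yield from self.chains(edge.dst, goal, path)

    def chain_risk(self, path):
        """Score a chain by its riskiest edge (an assumed scoring rule)."""
        lookup = {(e.src, e.dst): e.risk for e in self.edges}
        return max((lookup[pair] for pair in zip(path, path[1:])), default=0.0)


# Hypothetical browser-only capabilities; names and scores are invented.
g = CapabilityGraph()
g.add_edge("read_page", "fill_form", 0.2)
g.add_edge("fill_form", "submit_form", 0.6)
g.add_edge("read_page", "follow_link", 0.1)
g.add_edge("follow_link", "submit_form", 0.8)

# Rank candidate chains so the riskiest are examined first.
ranked = sorted(g.chains("read_page", "submit_form"),
                key=g.chain_risk, reverse=True)
for chain in ranked:
    print(" -> ".join(chain), g.chain_risk(chain))
```

Enumerating chains before writing any policy mirrors the sequencing the abstract argues for: the governance rules would then be derived from the chains that the red-team harness actually validates, not from the paths the designers expected.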