This preprint presents IAGA, a control-plane architecture for governing Large Language Model (LLM) API usage in enterprise environments. The system introduces deterministic cost enforcement, bounded budget guarantees, circuit-breaker-based reliability, synchronous response validation, and cryptographically scoped multi-tenant isolation. IAGA operates as an OpenAI-compatible gateway, requiring zero application-level changes while enforcing governance at request time. The paper details the full request lifecycle, cost containment model, caching architecture, and reliability mechanisms, and reports a controlled performance evaluation under simulated enterprise workloads.
Edoardo Bambini studied this question.
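
To make the idea of deterministic cost enforcement with a bounded budget concrete, the following is a minimal sketch and not IAGA's actual implementation. It assumes one common way to obtain such a bound: reserve the worst-case cost of a request (prompt tokens plus the output-token cap) before forwarding it, then settle against the actual cost once the response arrives. The names `TenantBudget`, `try_reserve`, `settle`, and `worst_case_cost`, as well as the integer price units, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class TenantBudget:
    """Hypothetical per-tenant ledger; all amounts are integer cost units."""
    limit: int
    spent: int = 0
    reserved: int = 0

    def try_reserve(self, worst_case: int) -> bool:
        """Admit a request only if its worst-case cost still fits the budget."""
        if self.spent + self.reserved + worst_case > self.limit:
            return False  # reject at request time, before any upstream call
        self.reserved += worst_case
        return True

    def settle(self, worst_case: int, actual: int) -> None:
        """Release a reservation and record what the request actually cost."""
        if actual > worst_case:
            raise ValueError("actual cost exceeds the reserved worst case")
        self.reserved -= worst_case
        self.spent += actual


def worst_case_cost(prompt_tokens: int, max_output_tokens: int,
                    input_price: int, output_price: int) -> int:
    """Upper bound on cost, assuming the output-token cap is enforced."""
    return prompt_tokens * input_price + max_output_tokens * output_price


# Usage: two requests fit a 1000-unit budget, a third is rejected up front.
budget = TenantBudget(limit=1000)
cost = worst_case_cost(prompt_tokens=100, max_output_tokens=50,
                       input_price=2, output_price=6)
admitted = [budget.try_reserve(cost) for _ in range(3)]
budget.settle(cost, actual=320)  # first response came in under the cap
```

Because every admitted request is counted at its worst case until it settles, `spent + reserved` never exceeds `limit` under this scheme, which is the kind of invariant a bounded budget guarantee would rely on. A real gateway would also need atomic updates across concurrent requests, which this single-threaded sketch omits.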