Retrieval-Augmented Generation (RAG) and multi-agent LLM systems are increasingly deployed in enterprise settings that handle sensitive data. While traditional software enforces least privilege via database predicates and API authorization, semantic retrieval and agent delegation introduce new authorization failure modes that are not captured by conventional access-control models. This Systematization of Knowledge (SoK) paper organizes the authorization security landscape for agentic RAG. We (i) define a threat model and correctness criteria for authorization in semantic retrieval pipelines, (ii) develop a taxonomy of authorization failure modes including semantic overfetch, cross-domain synthesis leakage, and delegation escalation, (iii) classify and evaluate existing mitigation families such as role-partitioned indices, metadata-tag filtering, post-generation redaction, and agent tool-scope controls, and (iv) systematize a unifying correctness property, Authorization-First Retrieval (AFR), which treats authorization as a precondition to defining the semantic retrieval candidate space. AFR is an ordering property that many defenses implicitly aim for but do not state formally.
Rohith Namboothiri studied this question.
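The ordering that AFR describes, authorization first and semantic ranking second, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the `Doc` record, the role-set authorization model, the `afr_retrieve` function, and the toy two-dimensional embeddings are all hypothetical.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class Doc:
    """Hypothetical document record; not a structure defined by the paper."""
    doc_id: str
    embedding: tuple          # toy embedding vector
    allowed_roles: frozenset  # roles permitted to read this document


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def afr_retrieve(query, docs, roles, k=2):
    """Illustrative AFR-style retrieval: authorize, then rank."""
    # Step 1: authorization defines the candidate space. A document the
    # caller cannot read is never scored, so it cannot reach the top-k.
    candidates = [d for d in docs if d.allowed_roles & roles]
    # Step 2: semantic ranking runs only over the authorized candidates.
    ranked = sorted(
        candidates,
        key=lambda d: cosine(query, d.embedding),
        reverse=True,
    )
    return [d.doc_id for d in ranked[:k]]


# Toy corpus (assumed data, for illustration only).
DOCS = [
    Doc("hr-salaries", (1.0, 0.0), frozenset({"hr"})),
    Doc("eng-handbook", (0.9, 0.1), frozenset({"eng", "hr"})),
    Doc("eng-roadmap", (0.0, 1.0), frozenset({"eng"})),
]

if __name__ == "__main__":
    query = (1.0, 0.0)  # semantically closest to "hr-salaries"
    print(afr_retrieve(query, DOCS, frozenset({"eng"})))
    print(afr_retrieve(query, DOCS, frozenset({"hr"})))
```

In this sketch a caller holding only the `eng` role never has `hr-salaries` scored, even though it is the closest semantic match to the query. A pipeline that ranked first and filtered afterwards would have to handle that document after it had already entered the candidate set, which is the ordering AFR is meant to rule out.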