The EU AI Act (Regulation (EU) 2024/1689) establishes comprehensive requirements for logging, monitoring, and human oversight of high-risk AI systems. However, the Act contains a structural gap: it mandates traceability without requiring that the infrastructure producing the evidence be architecturally independent of the system being observed. I argue that traceability which depends on the observed system's cooperation is not traceability; it is self-reporting with regulatory decoration. The argument draws on four sources: Nannini et al. (2026), who identify an absent "fourth tier" in AI compliance architecture; Carli et al. (2026), who highlight the risk of diluted oversight in the Digital Omnibus proposal; Carro et al. (2025), whose conceptual framework for AI capability evaluations establishes the boundaries of the pre-deployment domain; and Pfister and Jud (2025), whose epistemological critique demonstrates that pre-deployment benchmarks cannot predict post-deployment behavior. From these I identify a temporal blind spot in the current evaluation paradigm: there is no independent observation infrastructure for AI systems already in production. I describe the architectural properties such infrastructure requires and present a running implementation demonstrating their feasibility.
Nehuén Eluney Mercado (Fri,) studied this question.