What question did this study set out to answer?

This study investigates normalization of deviance in autonomous AI agent systems and proposes effective monitoring techniques for detecting it.

March 25, 2026 | Open Access

Detecting Normalization of Deviance in Multi-Agent Systems: Empirical Evidence for Graph-Based Behavioral Drift Detection

Key Points

Investigated normalization of deviance in autonomous AI agent systems and proposed effective monitoring techniques
Conducted three independent test runs to assess drift detection methods
Compared stateful session tracking with stateless mode
Analyzed how HTTP gateways interact with application-level vulnerabilities
Stateful session tracking detected 6.7% behavioral drift, while 19.3% drift went undetected in stateless mode
Demonstrated that defense-in-depth strategies create blind spots where gateways mask application vulnerabilities
Found that HTTP gateways provide zero MCP protocol security

Abstract

Autonomous AI agent systems exhibit gradual behavioral drift, termed normalization of deviance, that systematically evades threshold-based monitoring approaches. We present empirical evidence from three independent test runs (pass rates from 72% to 100%) and a 19-day silent failure in production demonstrating that: (1) stateful session tracking detects 6.7% drift, versus 19.3% drift left undetected in stateless mode; (2) defense-in-depth creates blind spots where gateways mask application vulnerabilities; (3) HTTP gateways provide zero MCP protocol security. We propose graph-based time series anomaly detection (TSAD) as the methodological framework for multi-agent behavioral monitoring.
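To illustrate why gradual drift can slip past a per-step threshold while a session-aware check catches it, here is a minimal Python sketch. It is not the study's implementation: the function `stateless_alerts`, the class `SessionTracker`, the integer "deviation scores", and both threshold values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


def stateless_alerts(scores: List[int], step_threshold: int) -> List[int]:
    """Hypothetical stateless check: flag step i only when it jumps
    more than step_threshold away from the immediately preceding step."""
    return [
        i for i in range(1, len(scores))
        if abs(scores[i] - scores[i - 1]) > step_threshold
    ]


@dataclass
class SessionTracker:
    """Hypothetical stateful check: remembers the session's first
    observation and flags any step that drifts too far from it."""
    drift_threshold: int
    baseline: Optional[int] = None
    alerts: List[int] = field(default_factory=list)

    def observe(self, index: int, score: int) -> None:
        if self.baseline is None:
            # The first observation of the session becomes the baseline.
            self.baseline = score
            return
        if abs(score - self.baseline) > self.drift_threshold:
            self.alerts.append(index)


# Assumed data: a deviation score that creeps up by 2 points per step.
scores = list(range(0, 22, 2))  # 0, 2, 4, ..., 20

stateless = stateless_alerts(scores, step_threshold=5)

tracker = SessionTracker(drift_threshold=10)
for i, s in enumerate(scores):
    tracker.observe(i, s)

print(stateless)       # [] : no single step exceeds the per-step threshold
print(tracker.alerts)  # [6, 7, 8, 9, 10] : cumulative drift is flagged
```

Each individual step looks acceptable relative to its predecessor, so the stateless check never fires, while the tracker that keeps a session baseline flags the accumulated deviation. A graph-based approach such as the one the abstract proposes would extend this idea across interacting agents; that extension is not shown here.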
