Untrusted text entering AI agent pipelines, template engines, and identity systems carries invisible attacks: homoglyph substitution that bypasses keyword filters, zero-width characters that split delimiters, Unicode Tag block characters that encode instructions tokenizers read but humans cannot see, and bidirectional overrides that reorder displayed text. These attacks operate below the layer where existing defenses (HTML escaping, schema validation, probabilistic detection) are designed to function. navi-sanitize is a zero-dependency Python library that removes these vectors deterministically at the input boundary. A six-stage pipeline removes null bytes, strips 411 invisible characters, applies NFKC normalization, replaces 54 targeted homoglyphs with Latin equivalents, re-normalizes to guarantee idempotency, and runs a pluggable context-specific escaper, producing identical output for identical input, with zero false positives on legitimate Unicode text. Clean-path latency is 2.8 µs per string.
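The six stages described above can be sketched with the Python standard library alone. This is a minimal illustration and not navi-sanitize's actual API: the function name `sketch_sanitize`, the `escaper` parameter, and both lookup tables are hypothetical, and the tables are small stand-ins for the library's 411 invisible characters and 54 homoglyphs.

```python
import unicodedata

# Hypothetical, illustrative subset of invisible code points
# (the real library's table of 411 characters is not reproduced here).
INVISIBLE = {0x200B, 0x2060, 0xFEFF}        # zero-width space, word joiner, BOM
INVISIBLE |= set(range(0x202A, 0x202F))     # bidi embeddings/overrides U+202A..U+202E
INVISIBLE |= set(range(0x2066, 0x206A))     # bidi isolates U+2066..U+2069
INVISIBLE |= set(range(0xE0000, 0xE0080))   # Unicode Tag block

# Hypothetical subset of Cyrillic-to-Latin homoglyph replacements.
HOMOGLYPHS = {
    "\u0430": "a", "\u0435": "e", "\u043e": "o",
    "\u0440": "p", "\u0441": "c", "\u0455": "s",
}

# Mapping a code point to None makes str.translate delete it.
_STRIP_TABLE = dict.fromkeys(INVISIBLE)
_HOMOGLYPH_TABLE = str.maketrans(HOMOGLYPHS)


def sketch_sanitize(text, escaper=None):
    """Assumed six-stage flow; `escaper` is an optional callable str -> str."""
    text = text.replace("\x00", "")               # 1. remove null bytes
    text = text.translate(_STRIP_TABLE)           # 2. strip invisible characters
    text = unicodedata.normalize("NFKC", text)    # 3. NFKC normalization
    text = text.translate(_HOMOGLYPH_TABLE)       # 4. replace homoglyphs
    text = unicodedata.normalize("NFKC", text)    # 5. re-normalize
    if escaper is not None:
        text = escaper(text)                      # 6. context-specific escaper
    return text
```

For example, `sketch_sanitize("p\u0430ss\u200bword")` maps the Cyrillic "а" to Latin "a" and drops the zero-width space. Each stage is a pure function of its input, so the same string always produces the same result, which is the deterministic property the abstract emphasizes.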
N. D. Spence (Sat,) studied this question.