This paper provides the first rigorous mathematical analysis of a formula that has been implicitly used in neural network normalization for 50 years: Semantic = v / ||v||. We do not propose this as a new method or invention, but rather as a mathematical re-examination of a technique that has been empirically successful since the 1960s yet never fully understood. We prove that all normalized neural networks, regardless of architecture, task, or training method, encode information through deterministic high-dimensional vector decomposition into two orthogonal components: a semantic vector that carries meaningful information, and a stability scalar that ensures numerical stability. The key insight is that the unit vector vd / ||v|| precisely isolates the semantic contribution of each neuron—its sign indicates its polarity on a semantic axis, and its magnitude its contribution strength—while the global norm ||v|| serves purely as a stability mechanism. This decomposition reveals that neural networks naturally separate semantic encoding from numerical stability through deterministic mathematical structure—a principle that has been implicitly used for decades but never mathematically characterized. Our framework provides the missing mathematical foundation for 50 years of normalization practice and resolves several longstanding mysteries in interpretability research.
Building similarity graph...
Analyzing shared references across papers
Loading...
YingXu Wang
Building similarity graph...
Analyzing shared references across papers
Loading...
YingXu Wang (Mon,) studied this question.
www.synapsesocial.com/papers/69df2b85e4eeef8a2a6b085f — DOI: https://doi.org/10.5281/zenodo.19556542