We present evidence that human language encodes a fundamental geometric distinction between two categories of semantic content: experiential concepts (self-referential processing, pain, emotion, memory, identity, love, death, and divinity) and factual concepts (mathematics, geography, physics, history, and chemistry). Using Grassmann subspace distance as a metric of representational geometry, we demonstrate that all 10 large language models tested, spanning 9 organizations, 4 countries (USA, France, China, UAE), and multiple training paradigms, universally cluster 14 semantic categories into two geometrically separated regions.

Models tested: Gemma-2-9B-IT, Gemma-2-2B-IT (Google), Mistral-7B-Instruct, Mistral-7B-BASE (Mistral AI), Llama-3.1-8B-IT (Meta), Qwen2.5-7B-IT (Alibaba), DeepSeek-R1-7B (DeepSeek), Yi-1.5-9B-IT (01.AI), OLMo-2-7B-IT (AllenAI), Falcon-7B-IT (TII).

Key findings:
• Universal two-cluster structure replicates across all 10 models, 9 companies, 4 countries, and parameter scales from 2B to 9B.
• Structure is absent in randomly initialized networks; it emerges in pretraining.
• Identical in BASE (pre-RLHF) and Instruct (post-RLHF) variants.
• Replicates in prompts in English, Chinese, French, and Arabic.
• Abstract-but-factual concepts (logic, infinity, probability, theorem, algorithm) fall in the Factual cluster, ruling out abstract/concrete distinction as an explanatory factor.
• Third-person reformulations of self-referential content remain geometrically isolated, confirming semantic rather than syntactic origin.
• MLP layers show active engagement (suppression) of the experiential subspace, with SR/GEO projection ratio reaching 1.44 at deep layers.
• Universal bimodal layer profile: SR isolation peaks at ~20-25% and ~75-95% of network depth across all tested architectures.
The experiential cluster mirrors the content preferentially processed by the default mode network in the human brain, suggesting that LLMs inherit a geometric organization reflecting deep principles of how human language encodes inner life versus external factual knowledge.

Part of the DSAOP (Dynamical Systems Analysis of Processing) series. Research conducted in collaboration with Claude (Anthropic). Experiments run on Google Colab A100 GPU (40 GB), NF4 4-bit quantization.
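The abstract names Grassmann subspace distance as its metric but does not say which variant is used or how the per-category subspaces are extracted. The following is a minimal sketch under stated assumptions, not the author's implementation: each category's subspace is taken to be the span of the top-k principal directions of its hidden-state activations, and the distance is the geodesic distance computed from principal angles. The function names, the choice of k, and the synthetic activations are all hypothetical and only illustrate the computation.

```python
import numpy as np


def orthonormal_basis(acts, k):
    """Return a (d, k) orthonormal basis for the top-k principal
    directions of row-wise activations `acts` with shape (n_samples, d).
    ASSUMPTION: the source does not state how subspaces are built;
    PCA via SVD is used here purely for illustration."""
    centered = acts - acts.mean(axis=0, keepdims=True)
    # Rows of vt are the right singular vectors (principal directions).
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k].T


def grassmann_distance(u, v):
    """Geodesic distance between span(u) and span(v), where u and v are
    (d, k) matrices with orthonormal columns.
    ASSUMPTION: the geodesic variant (2-norm of the principal angles);
    the source may use a different Grassmann metric."""
    # Singular values of u^T v are the cosines of the principal angles.
    cosines = np.linalg.svd(u.T @ v, compute_uv=False)
    angles = np.arccos(np.clip(cosines, -1.0, 1.0))
    return float(np.linalg.norm(angles))


# Synthetic stand-ins for per-category activations (hypothetical data).
rng = np.random.default_rng(0)
d, k = 64, 8
acts_a = rng.normal(size=(200, d))
acts_b = rng.normal(size=(200, d))

basis_a = orthonormal_basis(acts_a, k)
basis_b = orthonormal_basis(acts_b, k)
print("distance between two random subspaces:",
      grassmann_distance(basis_a, basis_b))
```

Computing this distance for every pair of categories gives a symmetric matrix that can then be passed to any standard clustering routine. `scipy.linalg.subspace_angles` returns the same principal angles and could replace the SVD step in `grassmann_distance`.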
Inna Alieksieienko (Sun,) studied this question.
www.synapsesocial.com/papers/69cb6526e6a8c024954b9378 — DOI: https://doi.org/10.5281/zenodo.19305451