What question did this study set out to answer?

This research investigates the relationship between effective rank and intrinsic dimension in weight matrices of large language models.

May 26, 2026 | Open Access

Full Rank is an Illusion: Weight Matrices in Large Language Models are Composites of Low-Dimensional Sub-Manifolds

Key Points

Analyzed weight matrices split along functional boundaries such as attention heads and Q/K/V segments.
Ran experiments on Qwen3.6-27B and DeepSeek V4 Flash (280B).
Computed eRank/TwoNN ratios for whole matrices and for their segments to characterize how each matrix is composed.
Splitting weight matrices along these boundaries lowers the eRank/TwoNN ratio by 1–2 orders of magnitude.
Per-head ratios converge to 4–9x in both models despite a 10x difference in total parameter count.
Value heads in DeltaNet linear attention are nearly flat, with a ratio of about 2x.

Abstract

We show that the large gap between effective rank (eRank) and TwoNN intrinsic dimension in LLM weight matrices is evidence not of manifold curvature but of sub-manifold concatenation. When weight matrices are split along known functional boundaries (attention heads, Q/K/V segments), the eRank/TwoNN ratio drops by 1–2 orders of magnitude. Experiments on Qwen3.6-27B and DeepSeek V4 Flash (280B) demonstrate that per-head ratios converge to 4–9x across both models despite a 10x difference in parameter count. Value heads in DeltaNet linear attention are nearly flat (ratio ≈ 2x). Expert weights in deep MoE layers develop internal sub-manifold structure that is absent in shallow layers. Code and data are open-sourced.
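The comparison described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' released code. It assumes that eRank is the exponential of the Shannon entropy of the normalized singular values, that TwoNN is the maximum-likelihood estimator built from the ratio of second to first nearest-neighbor distances, that the rows of a weight matrix are treated as the points, and that heads occupy contiguous, equal-sized row blocks. All function names and the synthetic matrix are hypothetical.

```python
import numpy as np


def effective_rank(w: np.ndarray) -> float:
    """exp of the Shannon entropy of the normalized singular values (assumed eRank definition)."""
    s = np.linalg.svd(w, compute_uv=False)
    p = s / s.sum()
    p = p[p > 0]  # drop exact zeros so the logarithm is defined
    return float(np.exp(-(p * np.log(p)).sum()))


def twonn_dimension(points: np.ndarray) -> float:
    """TwoNN maximum-likelihood estimate; each row of `points` is one sample."""
    sq = (points ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * points @ points.T  # squared distances
    np.maximum(d2, 0.0, out=d2)   # clip small negative rounding errors
    np.fill_diagonal(d2, np.inf)  # exclude each point's distance to itself
    nearest = np.sqrt(np.partition(d2, 1, axis=1)[:, :2])
    r1, r2 = nearest[:, 0], nearest[:, 1]
    keep = r1 > 0                 # skip exact duplicate rows
    mu = r2[keep] / r1[keep]
    return float(keep.sum() / np.log(mu).sum())


def ratio(w: np.ndarray) -> float:
    """eRank / TwoNN ratio for one matrix or one segment."""
    return effective_rank(w) / twonn_dimension(w)


def per_head_ratios(w: np.ndarray, n_heads: int) -> list[float]:
    """Split rows into equal contiguous blocks (assumed head layout) and score each."""
    return [ratio(block) for block in np.split(w, n_heads, axis=0)]


if __name__ == "__main__":
    # Synthetic stand-in for a projection matrix: 8 "heads", each with
    # 128 rows that lie in their own random 4-dimensional subspace.
    rng = np.random.default_rng(0)
    heads = [rng.normal(size=(128, 4)) @ rng.normal(size=(4, 256)) for _ in range(8)]
    w = np.vstack(heads)
    print("whole matrix ratio:", ratio(w))
    print("per-head ratios:", per_head_ratios(w, 8))
```

On this toy input the whole-matrix ratio should come out well above the per-head ratios, because stacking blocks from different subspaces raises the rank of the full matrix while the nearest neighbors of each row mostly stay inside its own block. Real checkpoints would need the actual head and Q/K/V layout of the model in place of the equal-block split assumed here.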
