Paper 2 v3 in the friction-theory paper series. This version is a substantive revision of v2 (DOI 10.5281/zenodo.20127860) with one substantial new section (§2.5b) and cascading updates to the abstract, §2.6, §5.1, §5.4, §6, and the bibliography. The capacity-scaling substance of v1/v2 (Findings 1-3) is unchanged and is reproduced word-for-word in §2.1-§2.5; v3 adds Finding 4, which documents empirical convergence to a substrate-universal race-architecture floor at 85 % application accuracy.

What is new in v3:

§2.5b (new, ~900 words): Race-architecture floor: empirical convergence to 85 % application across substrates. Three substrates from different organisations (Llama-3.3-70B-Instruct from Meta, dense; DeepSeek-V3 from DeepSeek, MoE; Cogito-V2.1-671B from DeepCogito, IDA-distilled MoE) converge to the same 85 % application accuracy on the Zorbetik domain despite different architectures and fine-tuning paradigms. Two additional 405 B-class fine-tunes (Llama-3.1-405B-Instruct, older Meta Instruct; Hermes-4-405B, Nous IDA) diverge below the floor for paradigm-specific reasons, confirming that fine-tuning regime, not capacity alone, determines whether a substrate reaches the floor. The 85 % floor is interpreted as the empirical manifestation of a race-architecture floor on the application task (Paper 1 §2.5; Paper 10 §1.5): capacity buys depth-tolerance, not depth-immunity.

Structural-impossibility analogy. §2.5b makes the race-architecture floor explicit as a structural upper bound, not a technological one, analogous to thermodynamic ceilings (the Carnot limit on heat-engine efficiency). No substrate operating under R1 (parallel candidate routes) can achieve P(correct) = 1 on a non-trivial task, regardless of scale, training, or fine-tuning regime. Sub-floor error rates on similarly structured tasks require architecture above the substrate (multi-model ensembling, formal verification layers, human-in-the-loop pipelines), not larger substrates.
Ensembling lowers the aggregate error rate but does not abolish the floor.

Abstract updated with Finding 4 (race-architecture floor + 4-substrate convergence).

§2.6 Theoretical interpretation reformulated to address both the cloze floor (~90 %) and the application floor (~85 %) as substrate-universal race-architecture floors; "intelligence headroom" is refined to be bounded above by the application floor, not by 100 % accuracy.

§5.1 What this paper establishes extended from three findings + methodological recommendation to four findings + methodological recommendation; the new Finding 4 captures the substrate-universal race-architecture floor result.

§5.4 Future work adds a new first bullet introducing Paper 2C (in preparation) as the chain-depth-axis companion (RACE-50 benchmark on a 327-substance invented domain with algorithmically validated DAG depth 64), testing whether the race-architecture floor manifests on the chain-depth axis with a substrate-universal curve shape across model capacity scales.

§6 Conclusion updated to four findings; cross-cite to Paper 2C added.

Paper 2B cross-cite paragraph added in §2.5b, connecting winning-route amplification (Paper 2B, substrate-mechanism companion) to the empirical floor: both findings point to a common substrate mechanism beneath the ICL/FT distinction and capacity-floor convergence.

§8 References: Paper 2C entry added (Pødenphant Lund 2026Y, in preparation).

What is unchanged from v2: all of §2.1 Design, §2.2-§2.5 Findings 1-3 (monotonic application scaling, bottleneck migration, MoE active-parameter scaling), §2.7-§2.10 (Yerkes-Dodson, first-token friction, caveats, somatic markers as field-layer prerequisite for elaborative encoding), §3 (Methodological note on frontloaded ICL), §4 (Scope note on encoding-battery / Paper 4), §5.2 (Implications for C-dimension), §5.3 (Limitations). These sections are reproduced word-for-word from v2.

Abstract. Large language models solve two distinguishable task types on the same underlying knowledge base.
Cloze retrieval saturates early (~90 % by 8 B parameters); application scales monotonically across three orders of magnitude (2 % at 0.5 B to 85 % at 70 B). We document this asymmetry on a single invented knowledge domain ("Zorbetik") across nine models, using frontloaded in-context learning (Brown et al. 2020) to expose encoding-to-retrieval dynamics without weight updates. Four findings: (1) Application scales monotonically with capacity (Spearman ρ = +1.000 on Qwen2.5; cross-family panel ρ = +0.92, n=9; slope +40.8 pp per decade); (2) Bottleneck migrates with capacity: at 0.5 B retrieval fails, at 14 B 36 % of questions show "retrieval succeeds, derivation fails"; (3) Mixture-of-Experts models scale on active parameters, not total; (4) NEW IN V3: Three substrates from different organisations converge to the same 85 % application accuracy, identifying a substrate-universal race-architecture floor (Paper 1 §2.5; Paper 10 §1.5). Capacity buys depth-tolerance, not depth-immunity. We recommend frontloaded ICL as the operational instrument for encoding-retrieval studies of this kind in place of fine-tuning. A companion paper (Paper 2C, in preparation) develops a controlled chain-depth benchmark testing whether the same floor manifests as substrate-universal depth-degradation across model capacity scales.

Companion papers in the series:
Paper 0 (BFT): 10.5281/zenodo.19462500
Paper 1 (FT generalised): 10.5281/zenodo.20012655
Paper 2B (substrate-mechanism companion, in preparation)
Paper 2C (chain-depth axis companion, in preparation)
Paper 3 (Friction-guided inference): 10.5281/zenodo.20014122
Paper 10 (Race-architecture, physics scope): 10.5281/zenodo.20014568

Data, fine-tuning notebooks, analysis scripts: https://github.com/tplund/friction-theory-p2-capacity-scaling
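The two scaling statistics quoted in the abstract (Spearman ρ between model size and application accuracy, and the slope in percentage points per decade of parameters) can be illustrated with a minimal sketch. This is not the paper's analysis script: the function names are hypothetical, the input values are illustrative placeholders rather than the paper's measurements, and the slope is assumed here to be an ordinary least-squares fit of accuracy against log10(parameter count).

```python
import math


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def slope_pp_per_decade(params_billions, accuracy_pct):
    """Least-squares slope of accuracy (pp) against log10(parameters)."""
    logs = [math.log10(p) for p in params_billions]
    mean_l = sum(logs) / len(logs)
    mean_a = sum(accuracy_pct) / len(accuracy_pct)
    num = sum((l - mean_l) * (a - mean_a) for l, a in zip(logs, accuracy_pct))
    den = sum((l - mean_l) ** 2 for l in logs)
    return num / den


if __name__ == "__main__":
    # Placeholder values for illustration only; NOT the paper's data.
    params = [0.5, 1.5, 3.0, 7.0, 14.0]
    accuracy = [2.0, 10.0, 25.0, 40.0, 55.0]
    print("Spearman rho:", spearman(params, accuracy))
    print("Slope (pp/decade):", slope_pp_per_decade(params, accuracy))
```

Fitting against log10 of the parameter count makes the slope read directly as "percentage points gained per tenfold increase in capacity", which is the unit the abstract uses. A rank correlation is insensitive to the shape of the curve, so a perfectly monotonic series gives ρ = 1 whether or not it is log-linear.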
Tomas Pødenphant Lund (Thu,) studied this question.
www.synapsesocial.com/papers/6a0809f1a487c87a6a40bd9c — DOI: https://doi.org/10.5281/zenodo.20187513
Tomas Pødenphant Lund
Aarhus University