What question did this study set out to answer?

The central aim is to introduce a navigation architecture that allows large language models to efficiently respond to complex normative queries without retraining.

March 12, 2026 · Open Access

Skill Without Training: Deterministic Knowledge Navigation for Large Language Models over Structured Documents

Key Points

Developed a navigation architecture in which a 3-pass ontology compiler turns a raw specification into 14 machine-readable indices.
Implemented a Model Context Protocol (MCP) server that answers queries from these indices.
Used deterministic pipelines for keyword enrichment, index search, and weighted reading plan construction.
Achieved 100% accuracy on 246 evaluation questions (5 runs each), compared with ~85% accuracy and high variance for the baseline without indices.
Reduced token usage by 7× relative to the agentic baseline and 15× relative to full-context prompting while maintaining performance.
Evaluated the architecture across four question categories: factual, cross-referential, philosophical/abstract, and normative.
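To make the index search and reading-plan steps concrete, here is a minimal TypeScript sketch of how a query might be routed and turned into a weighted reading plan. It is an illustration under stated assumptions, not the authors' implementation. The names `Route`, `IndexEntry`, `KeywordIndex`, `routeQuery`, and `buildReadingPlan` are hypothetical, as are the rule of routing by the leading interrogative and the rule of summing per-keyword weights.

```typescript
// Hypothetical sketch: every name below is an illustrative assumption,
// not an identifier from the paper or its companion repository.
type Route = "WHAT" | "WHY" | "HOW" | "WHEN";

interface IndexEntry {
  section: string; // section identifier, e.g. "4.2"
  route: Route;    // kind of question this section is assumed to answer
  weight: number;  // assumed relevance weight stored in the index
}

// Assumed shape of one compiled index: keyword -> matching sections.
type KeywordIndex = Record<string, IndexEntry[]>;

// Route a query by its leading interrogative; anything else falls back to WHAT.
function routeQuery(query: string): Route {
  const q = query.trim().toLowerCase();
  if (/^why\b/.test(q)) return "WHY";
  if (/^how\b/.test(q)) return "HOW";
  if (/^when\b/.test(q)) return "WHEN";
  return "WHAT";
}

// Build a reading plan: look up each distinct query word, keep entries on the
// query's route, sum weights per section, and sort by descending weight.
// Ties are broken by section id so the same query always yields the same plan.
function buildReadingPlan(query: string, index: KeywordIndex): IndexEntry[] {
  const route = routeQuery(query);
  const words = query
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((w, i, all) => w.length > 0 && all.indexOf(w) === i);

  const bySection: Record<string, IndexEntry> = Object.create(null);
  const plan: IndexEntry[] = [];

  for (const word of words) {
    if (!Object.prototype.hasOwnProperty.call(index, word)) continue;
    for (const entry of index[word] || []) {
      if (entry.route !== route) continue;
      const seen = bySection[entry.section];
      if (seen) {
        seen.weight += entry.weight;
      } else {
        // Copy the entry so the index itself is never mutated.
        const copy: IndexEntry = {
          section: entry.section,
          route: entry.route,
          weight: entry.weight,
        };
        bySection[entry.section] = copy;
        plan.push(copy);
      }
    }
  }

  return plan.sort(
    (a, b) => b.weight - a.weight || a.section.localeCompare(b.section)
  );
}
```

Because the sketch uses only lookups, sums, and a total ordering, the same query against the same index always produces the same plan. That repeatability is the property the word "deterministic" points at here. The real system's enrichment and routing rules are presumably richer than this.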

Abstract

This preprint presents a deterministic navigation architecture that enables unmodified frontier LLMs to achieve 97–100% accuracy on complex specification queries without fine-tuning, retraining, or embedding-based retrieval.

Core Idea

Large Language Models fail on large normative specifications not because they lack intelligence, but because they lack deterministic navigation. A 500-page ISO-grade document has deep hierarchical numbering, normative modality (MUST/SHOULD/MAY), cross-references across independent chapters, and no linear reading path. Instead of scaling context windows or using generic RAG, this work encodes the expert's mental map as a formal navigation algorithm.

Architecture

A 3-pass ontology compiler ingests a raw specification and produces 14 machine-readable indices. At runtime, an MCP (Model Context Protocol) server uses these indices to answer queries via a deterministic pipeline:

Keyword enrichment (T&D expansion → co-occurrence → waterfall confirmation)
Index search with ontological routing (WHAT/WHY/HOW/WHEN)
Weighted reading plan construction
Chain resolution → tier-weighted extraction
Token budget management with auto-expansion

Key Results

| Metric | MCP System | Baseline (no indices) |
| Accuracy | 100% (246 questions, 5 runs each) | ~85% with high variance |
| Spec tokens per query | ~2–4k | 25–30k (agentic) / 178k (full-context) |
| Tool calls per query | 1–2 | 15–20 |
| Wall time | ~3s | 25–35s |
| Token reduction | 7× vs agentic, 15× vs full-context | n/a |

Tested on Claude 3.5 Sonnet through Claude Sonnet 4 and reproduced on GPT-4. No fine-tuning, just structured tooling.

Evaluation Corpus

The evaluation corpus, the E.L.I.A. specification (~500 pages), is guaranteed not to appear in any LLM training corpus: it is a private, unpublished language specification developed as part of this research. This eliminates data contamination risk entirely and ensures that every correct answer is attributable to the navigation architecture, not memorized training data.

Benchmark

246 evaluation questions across 4 categories: Factual (82), Cross-referential (74), Philosophical/Abstract (45), Normative (45)
36-question hard subset targeting non-obvious, cross-referential, and philosophical queries
10-question conceptual set covering architectural, semantic, and design aspects
20-question demo set (Factual + Cross-referential) runnable against the open-source demo

All benchmark questions, raw answer dumps, scoring tables, and the agentic benchmark runner are included in the companion repository.

Supplementary Materials

Three annexes accompany this preprint:

Annex A (Trace Dumps): Three real MCP debug traces showing the full deterministic pipeline for WHAT, HOW, and HOW-Code queries
Annex B (Test Methodology): Complete evaluation protocol: 4 evaluation modes, 3-axis scoring rubric (accuracy × completeness × precision), 4-artifact per-question format, 3 models tested, 800+ evaluation artifacts
Annex C (Incremental Ablation): Layer-by-layer system build-up showing accuracy progression from 70% to 99+%

Companion Repository

The open-source companion repository contains:

MCP Spec-Reader server (TypeScript) with the full deterministic pipeline
14 pre-built JSON index files
Index build scripts (compiler)
Agentic benchmark runner
All benchmark question files (246 + 36 + 10 + 20)
Two representative specification sections for demo evaluation
Agent configuration files (specification structure, patterns, ontological mappings)

Repository: chudinovuv-arhiveX-1-demo

Patent Status

Patent pending. Provisional patent applications covering the navigation architecture, index compilation method, and auto-regressive compiler described in this paper have been filed.

Citation

Chudinov, Y. (2026). Skill Without Training: Deterministic Knowledge Navigation for Large Language Models over Structured Documents. Preprint.
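The abstract names "token budget management with auto-expansion" as the last pipeline step but does not describe its policy. The TypeScript sketch below shows one plausible reading, offered as an assumption and not as the authors' method. It starts from an initial budget, doubles the budget (up to a cap) until the sections marked as required fit, then fills the remaining space with optional sections in plan order. The names `PlannedSection`, `BudgetedPlan`, and `fitToBudget`, the `required` flag, and the doubling rule are all hypothetical.

```typescript
// Hypothetical sketch: the names, the `required` flag, and the doubling
// policy are assumptions; the paper does not specify them.
interface PlannedSection {
  section: string;   // section identifier from the reading plan
  tokens: number;    // estimated token cost of including the section
  required: boolean; // assumed flag: the answer is incomplete without it
}

interface BudgetedPlan {
  selected: string[]; // chosen section ids, in reading-plan order
  budget: number;     // final budget after any expansion
  used: number;       // tokens consumed by the selection
}

function fitToBudget(
  plan: PlannedSection[],
  initialBudget: number,
  maxBudget: number
): BudgetedPlan {
  if (initialBudget <= 0 || maxBudget < initialBudget) {
    throw new RangeError("expected 0 < initialBudget <= maxBudget");
  }

  // Total cost of the sections that must be read.
  let requiredTokens = 0;
  for (const s of plan) {
    if (s.required) requiredTokens += s.tokens;
  }

  // Auto-expansion: double the budget until the required sections fit
  // or the cap is reached. Terminates because budget > 0 and strictly grows.
  let budget = initialBudget;
  while (requiredTokens > budget && budget < maxBudget) {
    budget = Math.min(budget * 2, maxBudget);
  }

  const chosen: Record<string, boolean> = Object.create(null);
  let used = 0;

  // Pass 1: required sections, in plan order, as long as they fit.
  for (const s of plan) {
    if (s.required && used + s.tokens <= budget) {
      chosen[s.section] = true;
      used += s.tokens;
    }
  }
  // Pass 2: optional sections fill whatever budget is left.
  for (const s of plan) {
    if (!s.required && used + s.tokens <= budget) {
      chosen[s.section] = true;
      used += s.tokens;
    }
  }

  const selected = plan
    .filter((s) => chosen[s.section] === true)
    .map((s) => s.section);
  return { selected, budget, used };
}
```

Under these assumptions, the budget expands only when required material would otherwise be dropped, so typical queries stay near the initial budget. This is one way a system could keep per-query token counts low while still handling queries that need more context.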

Authors

Yurii Chudinov
