Data leakage affected 294 published papers across 17 scientific fields (Kapoor & Narayanan, 2023). The dominant response has been documentation: checklists, linters, best-practice guides. Documentation does not prevent these failures. This paper proposes a structural remedy: a grammar that decomposes the supervised learning lifecycle into 7 kernel primitives connected by a typed directed acyclic graph (DAG), with four hard constraints that reject the two most damaging leakage classes at call time. The grammar's core contribution is the terminal assess constraint: a runtime-enforced evaluate/assess boundary where repeated test-set assessment is rejected by a guard on a nominally distinct Evidence type. A companion study across 2, 047 experimental instances quantifies why this matters: selection leakage inflates performance by dᵦ = 0. 93 and memorization leakage by dᵦ = 0. 53-1. 11. Three separate implementations (Python, R, and Julia) confirm the claims. The appendix specification lets anyone build a conforming version.
Building similarity graph...
Analyzing shared references across papers
Loading...
Simon Roth
Building similarity graph...
Analyzing shared references across papers
Loading...
Simon Roth (Sat,) studied this question.
www.synapsesocial.com/papers/69ada90bbc08abd80d5bc5df — DOI: https://doi.org/10.5281/zenodo.18905072