November 9, 2025Open Access

The Structure of Cross-Validation Error: Stability, Covariance, and Minimax Limits

Key Points

Even with a large number of folds, cross-validation cannot reach the ideal risk estimation error of 1/n, demonstrating the inherent limitations of the method.
The study defines a new concept of squared loss stability, contributing to understanding algorithmic stability in risk estimation.
Minimax performance analysis shows cross-validation's efficiency is bounded between the k/n and √k/n regimes, highlighting a trade-off for data utilization in resampling.
Learning algorithms exhibit diverse behaviors, affecting the correlation of error estimates and their implications for cross-validation performance.

Abstract

Despite ongoing theoretical research on cross-validation (CV), many theoretical questions about CV remain widely open. This motivates our investigation into how properties of algorithm-distribution pairs can affect the choice for the number of folds in k-fold cross-validation. Our results consist of a novel decomposition of the mean-squared error of cross-validation for risk estimation, which explicitly captures the correlations of error estimates across overlapping folds and includes a novel algorithmic stability notion, squared loss stability, that is considerably weaker than the typically required hypothesis stability in other comparable works. Furthermore, we prove: 1. For every learning algorithm that minimizes empirical error, a minimax lower bound on the mean-squared error of k-fold CV estimating the population risk LD: \ ₊ ₍\; ₃\; E\![ (L₂ₕ^ (k) - L₃) ^2 \;=\; Ω\! (k/n), \] where n is the sample size and k the number of folds. This shows that even under idealized conditions, for large values of k, CV cannot attain the optimum of order 1/n achievable by a validation set of size n, reflecting an inherent penalty caused by dependence between folds. 2. Complementing this, we exhibit learning rules for which \ ₃\; E\![ (L₂ₕ^ (k) - L₃) ^2 \;=\; Ω (k/n), \] matching (up to constants) the accuracy of a hold-out estimator of a single fold of size n/k. Together these results delineate the fundamental trade-off in resampling-based risk estimation: CV cannot fully exploit all n samples for unbiased risk evaluation, and its minimax performance is pinned between the k/n and k/n regimes.

Connected Papers

Building similarity graph...

Analyzing shared references across papers

Discussion

Cite this study

Nachum et al. (Wed,) studied this question.

www.synapsesocial.com/papers/690fdcdaf60c54d04ea3821b — DOI: https://doi.org/10.48550/arxiv.2511.03554

Authors

Ido Nachum

Rüdiger Urbanke

Thomas Weinberger

Actions

References and Citations

Connected Papers

Building similarity graph...

Analyzing shared references across papers

The Structure of Cross-Validation Error: Stability, Covariance, and Minimax Limits

Key Points

Abstract

Citation Network

Connected Papers

Discussion

Cite this study

Authors

Actions

References and Citations

Citation Network

Connected Papers

Discussion