January 1, 1995 · Open Access

A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection

Abstract

We review accuracy estimation methods and compare the two most common methods: cross-validation and bootstrap. Recent experimental results on artificial data and theoretical results in restricted settings have shown that for selecting a good classifier from a set of classifiers (model selection), ten-fold cross-validation may be better than the more expensive leave-one-out cross-validation. We report on a large-scale experiment (over half a million runs of C4.5 and a Naive-Bayes algorithm) to estimate the effects of different parameters on these algorithms on real-world datasets. For cross-validation, we vary the number of folds and whether the folds are stratified or not; for bootstrap, we vary the number of bootstrap samples. Our results indicate that for real-world datasets similar to ours, the best method to use for model selection is ten-fold stratified cross-validation, even if computation power allows using more folds.
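To make the recommended procedure concrete, here is a minimal sketch of stratified k-fold cross-validation in plain Python. It is an illustration under stated assumptions, not the paper's experimental code: the function names (`stratified_folds`, `majority_class`, `cross_val_accuracy`) are hypothetical, and the learner is a trivial majority-class predictor standing in for C4.5 or Naive-Bayes.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=10, seed=0):
    """Split indices into k folds with roughly equal class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)

    folds = [[] for _ in range(k)]
    offset = 0
    for y in sorted(by_class):
        idx = by_class[y]
        rng.shuffle(idx)
        # Deal each class round-robin across folds, continuing where the
        # previous class stopped so fold sizes stay balanced.
        for j, i in enumerate(idx):
            folds[(offset + j) % k].append(i)
        offset += len(idx)
    return folds


def majority_class(train_labels):
    """Hypothetical stand-in learner: predict the most frequent training label."""
    counts = defaultdict(int)
    for y in train_labels:
        counts[y] += 1
    return max(sorted(counts), key=counts.get)


def cross_val_accuracy(labels, k=10, seed=0):
    """Estimate accuracy as total correct test predictions over all instances."""
    folds = stratified_folds(labels, k, seed)
    correct = 0
    for test_idx in folds:
        held_out = set(test_idx)
        train_labels = [y for i, y in enumerate(labels) if i not in held_out]
        prediction = majority_class(train_labels)
        correct += sum(1 for i in test_idx if labels[i] == prediction)
    return correct / len(labels)


if __name__ == "__main__":
    labels = ["a"] * 70 + ["b"] * 30
    print(cross_val_accuracy(labels, k=10))
```

Each instance is tested exactly once, and the estimate is the overall fraction of correct predictions. Setting `k` to the number of instances gives leave-one-out; replacing the round-robin deal with a plain shuffle gives the unstratified variant the abstract compares against.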

Cite this study

Ron Kohavi (Sun) studied this question.

www.synapsesocial.com/papers/6a085b77280cd4e998e8b87a — DOI: https://doi.org/10.5281/zenodo.19712698
