We review accuracy estimation methods and compare the two most common methods: cross-validation and bootstrap. Recent experimental results on artificial data and theoretical results in restricted settings have shown that for selecting a good classifier from a set of classifiers (model selection), ten-fold cross-validation may be better than the more expensive leave-one-out cross-validation. We report on a large-scale experiment, over half a million runs of C4.5 and a Naive-Bayes algorithm, to estimate the effects of different parameters on these algorithms on real-world datasets. For cross-validation, we vary the number of folds and whether the folds are stratified or not; for bootstrap, we vary the number of bootstrap samples. Our results indicate that for real-world datasets similar to ours, the best method to use for model selection is ten-fold stratified cross-validation, even if computation power allows using more folds.
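To make the recommended procedure concrete, the following is a minimal sketch of stratified k-fold cross-validation in plain Python. It is an illustration under stated assumptions, not the experimental setup of the paper: the helper names (`stratified_folds`, `cross_val_accuracy`, `fit_centroids`, `predict_centroid`) are hypothetical, the classifier is a toy one-dimensional nearest-centroid rule standing in for C4.5 or Naive-Bayes, and the data are synthetic.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, seed=0):
    """Split indices into k folds with roughly equal class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)

    folds = [[] for _ in range(k)]
    offset = 0  # continue the round-robin across classes to balance fold sizes
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        for j, i in enumerate(indices):
            folds[(offset + j) % k].append(i)
        offset += len(indices)
    return folds


def cross_val_accuracy(X, y, fit, predict, k=10, seed=0):
    """Estimate accuracy as the fraction of held-out instances classified correctly."""
    correct = 0
    for test_idx in stratified_folds(y, k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct += sum(predict(model, X[i]) == y[i] for i in test_idx)
    return correct / len(y)


# Toy stand-in classifier (assumption): nearest class mean on a single feature.
def fit_centroids(X, y):
    sums, counts = defaultdict(float), defaultdict(int)
    for x, label in zip(X, y):
        sums[label] += x
        counts[label] += 1
    return {label: sums[label] / counts[label] for label in sums}


def predict_centroid(model, x):
    return min(model, key=lambda label: abs(x - model[label]))


# Synthetic, well-separated data: 20 instances per class.
X = [float(v) for v in range(20)] + [float(v) for v in range(100, 120)]
y = [0] * 20 + [1] * 20

accuracy = cross_val_accuracy(X, y, fit_centroids, predict_centroid, k=10)
print(f"10-fold stratified CV accuracy: {accuracy:.2f}")
```

Setting `k` to the number of instances would give leave-one-out cross-validation; in that case stratification has no effect, since each fold holds a single instance.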
Ron Kohavi (Sun,) studied this question.
www.synapsesocial.com/papers/6a085b77280cd4e998e8b87a — DOI: https://doi.org/10.5281/zenodo.19712698
Ron Kohavi
Stanford University