Cross-validation Techniques
8 Flashcards
Nested Cross-Validation
Uses two nested layers of cross-validation: an inner loop tunes hyperparameters and an outer loop assesses the generalization performance of the whole tuning procedure. Suitable for small datasets and when model selection is important.
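A minimal sketch of nested cross-validation, assuming scikit-learn; the iris dataset, the SVM model, and the parameter grid are illustrative choices, not part of the definition:

```python
# Nested CV sketch (assumes scikit-learn): the inner loop tunes C for an
# SVM, the outer loop estimates generalization error of the tuning procedure.
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
inner = KFold(n_splits=3, shuffle=True, random_state=0)  # hyperparameter tuning
outer = KFold(n_splits=5, shuffle=True, random_state=0)  # generalization estimate
clf = GridSearchCV(SVC(), {"C": [0.1, 1, 10]}, cv=inner)
scores = cross_val_score(clf, X, y, cv=outer)  # one score per outer fold
print(round(scores.mean(), 3))
```

Note that the reported score averages over outer folds, so it reflects the full model-selection pipeline rather than one fixed hyperparameter setting.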
K-Fold Cross-Validation
Splits the data into K equal subsets (folds). Each fold serves as the test set once and as part of the training set in the other K-1 iterations. Appropriate for small to medium-sized datasets.
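A minimal K-fold sketch, assuming scikit-learn; 10 toy samples and K = 5 are illustrative:

```python
# K-fold sketch (assumes scikit-learn): every sample lands in exactly
# one test fold across the 5 iterations.
import numpy as np
from sklearn.model_selection import KFold

X = np.arange(10)
folds = list(KFold(n_splits=5).split(X))
for train_idx, test_idx in folds:
    print(len(train_idx), "train /", len(test_idx), "test")
```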
Holdout Method
Splits the data into two disjoint sets: a training set and a test set. Appropriate when you have a large dataset and want a quick evaluation.
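A holdout sketch, assuming scikit-learn; the 80/20 ratio and toy data are illustrative:

```python
# Holdout sketch (assumes scikit-learn): one fixed 80/20 train/test split.
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(50).reshape(50, 1)
y = np.arange(50) % 2
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)
print(len(X_train), "train /", len(X_test), "test")
```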
Leave-One-Out Cross-Validation (LOOCV)
A special case of K-fold cross-validation where K equals the number of data points. It is very computationally expensive and yields a nearly unbiased error estimate, though one with high variance. Best for very small datasets.
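A LOOCV sketch, assuming scikit-learn; six toy samples are illustrative:

```python
# LOOCV sketch (assumes scikit-learn): with n samples there are n splits,
# each holding out a single point as the test set.
import numpy as np
from sklearn.model_selection import LeaveOneOut

X = np.arange(6).reshape(6, 1)
splits = list(LeaveOneOut().split(X))
print(len(splits), "splits")
```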
Leave-P-Out Cross-Validation (LPOCV)
Leaves P data points out of the training data, and uses these P points as the validation set. Appropriate for small datasets or for assessing the impact of removing P points.
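An LPOCV sketch, assuming scikit-learn; five toy samples with P = 2 are illustrative, chosen to show that the number of splits grows combinatorially:

```python
# LPOCV sketch (assumes scikit-learn): leaving P = 2 of 5 points out
# produces C(5, 2) = 10 train/validation splits.
import numpy as np
from math import comb
from sklearn.model_selection import LeavePOut

X = np.arange(5).reshape(5, 1)
splits = list(LeavePOut(p=2).split(X))
print(len(splits), "splits, expected", comb(5, 2))
```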
Time Series Cross-Validation
A method that uses rolling train/test splits and is appropriate for time-dependent data. The model is always trained on observations that precede the test window, so no future information leaks into training.
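A time-series CV sketch, assuming scikit-learn; ten toy time steps are illustrative:

```python
# Time-series CV sketch (assumes scikit-learn): rolling splits where the
# training window always precedes the test window.
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

X = np.arange(10).reshape(10, 1)
splits = list(TimeSeriesSplit(n_splits=4).split(X))
for train_idx, test_idx in splits:
    # every training index comes strictly before every test index
    print("train up to", train_idx.max(), "-> test from", test_idx.min())
```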
Repeated Random Subsampling Validation
Randomly splits the data into training and test sets multiple times, and averages the results over the splits. Some points may appear in several test sets and others in none. Useful when the dataset is too large for exhaustive methods.
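A repeated-random-subsampling sketch, assuming scikit-learn's `ShuffleSplit`; ten repetitions of a 75/25 split over twenty toy samples are illustrative:

```python
# Repeated random subsampling sketch (assumes scikit-learn's ShuffleSplit):
# 10 independent random 75/25 splits whose scores would be averaged.
import numpy as np
from sklearn.model_selection import ShuffleSplit

X = np.arange(20).reshape(20, 1)
splits = list(ShuffleSplit(n_splits=10, test_size=0.25,
                           random_state=0).split(X))
print(len(splits), "random splits")
```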
Stratified K-Fold Cross-Validation
Similar to K-fold but each fold contains approximately the same percentage of samples of each target class as the complete set. It's appropriate when data has imbalanced class distributions.
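A stratified K-fold sketch, assuming scikit-learn; the 8:4 class imbalance in the toy labels is illustrative:

```python
# Stratified K-fold sketch (assumes scikit-learn): with an 8:4 class
# imbalance, every test fold preserves the overall 2:1 class ratio.
import numpy as np
from sklearn.model_selection import StratifiedKFold

X = np.zeros((12, 1))
y = np.array([0] * 8 + [1] * 4)
splits = list(StratifiedKFold(n_splits=4).split(X, y))
for _, test_idx in splits:
    # counts of class 0 and class 1 in this test fold
    print(np.bincount(y[test_idx]))
```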