Hyperparameter Tuning
Number of Epochs
The number of complete passes through the training dataset. Can range from a few to many thousands, depending on the size of the dataset and the problem complexity.
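A minimal training-loop sketch showing that one epoch is one full pass over the data; training_batches and train_step are hypothetical names assumed for illustration:

    num_epochs = 10                      # hyperparameter: how many full passes to make

    for epoch in range(num_epochs):
        for batch in training_batches:   # training_batches is assumed to yield mini-batches
            train_step(batch)            # train_step is a hypothetical update function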
Learning Rate
Controls how much the model weights are updated in response to the estimated error at each update step. Typically a small positive value such as 0.001 to 0.1, though it can lie anywhere between 0.0 and 1.0.
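A minimal sketch of where the learning rate enters a plain gradient-descent update, assuming the gradient of the loss has already been computed:

    learning_rate = 0.01                          # typical small positive value
    weights = weights - learning_rate * gradient  # step against the gradient, scaled by the learning rate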
Activation Function
Determines the output of a neuron given an input or set of inputs. Examples include sigmoid, tanh, ReLU.
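The three activations named above can be written directly with NumPy; this is a framework-agnostic sketch:

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))   # squashes inputs into (0, 1)

    def tanh(x):
        return np.tanh(x)                 # squashes inputs into (-1, 1)

    def relu(x):
        return np.maximum(0.0, x)         # zero for negative inputs, identity for positive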
Momentum
Enhances gradient descent by incorporating past gradients into each weight update, which smooths and accelerates convergence. Typically ranges from 0 to 1, with 0.9 being a common value.
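A minimal sketch of momentum added to gradient descent, assuming velocity starts at zero and gradient is the current gradient of the loss:

    momentum = 0.9                                             # typical value in [0, 1)
    velocity = momentum * velocity - learning_rate * gradient  # blend past updates with the new gradient
    weights = weights + velocity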
Max Depth of Tree
In decision-tree-based models, limits how deep the tree can grow. It is a form of regularization that helps prevent overfitting; the value is usually chosen based on validation performance.
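As a concrete example, scikit-learn's DecisionTreeClassifier exposes max_depth directly; X_train and y_train are assumed to exist:

    from sklearn.tree import DecisionTreeClassifier

    # A shallower tree fits less detail and is less prone to overfitting.
    model = DecisionTreeClassifier(max_depth=5)
    model.fit(X_train, y_train)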
Number of Layers
Dictates the depth of a neural network. It can range from a single layer to many, depending on the complexity of the function being modeled.
Batch Size
Number of samples that will be propagated through the network at one time. Common values are 32, 64, 128, etc.
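A sketch of slicing a training set into mini-batches of a chosen size; X_train, y_train, and train_step are assumed names, with X_train and y_train taken to be NumPy arrays:

    import numpy as np

    batch_size = 32
    indices = np.random.permutation(len(X_train))      # shuffle once per epoch

    for start in range(0, len(X_train), batch_size):
        batch = indices[start:start + batch_size]
        train_step(X_train[batch], y_train[batch])      # hypothetical update on one mini-batch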
Number of Neurons per Layer
Dictates the width of a neural network layer and can range from a few neurons to many thousands. It influences the representational capacity of the network.
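One way to see depth (number of layers) and width (neurons per layer) together is a small PyTorch model; the layer sizes here are illustrative only:

    import torch.nn as nn

    # Two hidden layers (depth) of 128 and 64 neurons (width), for 20 input features and 1 output.
    model = nn.Sequential(
        nn.Linear(20, 128),
        nn.ReLU(),
        nn.Linear(128, 64),
        nn.ReLU(),
        nn.Linear(64, 1),
    )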
Weight Initialization
Sets the initial values of the weights. Common strategies include zero initialization, random initialization, and Xavier/Glorot initialization.
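A sketch of Xavier/Glorot uniform initialization for a single weight matrix, using the usual sqrt(6 / (fan_in + fan_out)) bound; the layer sizes are illustrative:

    import numpy as np

    fan_in, fan_out = 256, 128                    # illustrative layer sizes
    limit = np.sqrt(6.0 / (fan_in + fan_out))     # Glorot uniform bound
    W = np.random.uniform(-limit, limit, size=(fan_in, fan_out))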
Dropout Rate
A regularization technique where randomly selected neurons are ignored during training. Typically ranges from 0 to 1, with 0.5 being a common value.
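A sketch of "inverted" dropout applied to a layer's activations during training; the activations array is assumed to exist:

    import numpy as np

    dropout_rate = 0.5                                         # fraction of neurons to drop
    mask = np.random.rand(*activations.shape) > dropout_rate   # keep each neuron with probability 1 - rate
    activations = activations * mask / (1.0 - dropout_rate)    # rescale so the expected activation is unchanged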
Ensemble Size
In ensemble methods, the number of individual models trained and combined. Common values depend on model complexity and available computational resources.
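A sketch of averaging an ensemble's predictions; train_model, its seed argument, and X_test are hypothetical names:

    import numpy as np

    ensemble_size = 10
    models = [train_model(seed=i) for i in range(ensemble_size)]        # train one model per seed
    predictions = np.mean([m.predict(X_test) for m in models], axis=0)  # average the individual predictions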
Learning Rate Decay
Gradually reduces the learning rate during training in order to let the algorithm settle at the minimum. Often implemented as an exponential decay.
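A sketch of a simple exponential decay schedule; the initial rate and decay factor are illustrative values:

    initial_lr = 0.1
    decay_rate = 0.96

    def decayed_lr(epoch):
        # The learning rate shrinks by a constant factor every epoch.
        return initial_lr * decay_rate ** epoch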
Early Stopping
A form of regularization where you stop training as soon as the performance on a validation set begins to degrade.
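A sketch of patience-based early stopping; run_epoch is a hypothetical function that trains for one epoch and returns the validation loss, and max_epochs is an assumed upper bound:

    best_loss = float("inf")
    patience, wait = 5, 0

    for epoch in range(max_epochs):
        val_loss = run_epoch()             # train for one epoch, then evaluate on the validation set
        if val_loss < best_loss:
            best_loss, wait = val_loss, 0  # improvement: reset the patience counter
        else:
            wait += 1
            if wait >= patience:           # no improvement for 'patience' epochs in a row
                break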
Gradient Clipping Threshold
Caps the size of the gradients to prevent the numerical-instability and convergence problems caused by exploding gradients. The clipping threshold itself is a hyperparameter to tune.
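A sketch of clipping a gradient by its norm; the gradient array is assumed to have been computed already:

    import numpy as np

    threshold = 1.0
    norm = np.linalg.norm(gradient)
    if norm > threshold:
        gradient = gradient * (threshold / norm)   # rescale so the norm equals the threshold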
Regularization Parameter
Controls the strength of the regularization applied to prevent overfitting. Common types are L1 (lasso) and L2 (ridge). Values are typically small positive numbers, often between 0 and 1.
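A sketch of how the regularization parameter enters the loss for L1 and L2 penalties; weights and data_loss are assumed to exist:

    import numpy as np

    lam = 0.01                                   # regularization strength
    l1_penalty = lam * np.sum(np.abs(weights))   # lasso: pushes weights toward exact zeros
    l2_penalty = lam * np.sum(weights ** 2)      # ridge: shrinks weights smoothly toward zero
    loss = data_loss + l2_penalty                # add the chosen penalty to the data loss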