Hyperparameter Tuning

15 Flashcards


Number of Epochs

The number of complete passes through the training dataset. Can range from a few to many thousands, depending on the size of the dataset and the problem complexity.
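
For illustration, a minimal NumPy training loop in which the epoch count controls how many full passes are made over the data (the toy data and values below are assumptions for the sketch):

import numpy as np

# Toy linear-regression data, assumed purely for illustration.
X = np.random.randn(200, 3)
y = X @ np.array([1.5, -2.0, 0.5]) + 0.1 * np.random.randn(200)

w = np.zeros(3)
n_epochs = 100   # hyperparameter: complete passes over the training set
lr = 0.01        # learning rate (see the Learning Rate card)

for epoch in range(n_epochs):
    grad = 2 * X.T @ (X @ w - y) / len(y)   # gradient of mean squared error
    w -= lr * grad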

Learning Rate

Controls the size of the weight update made in response to the estimated error on each step. Commonly a small value between 0.0 and 1.0, such as 0.01 or 0.001.
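
As a sketch, the learning rate simply scales each gradient-descent update (the 0.01 default below is an assumed example, not a recommendation):

def sgd_step(w, grad, lr=0.01):
    # Too large: updates overshoot and may diverge.
    # Too small: training converges very slowly.
    return w - lr * grad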

Activation Function

Determines the output of a neuron given an input or set of inputs. Examples include sigmoid, tanh, ReLU.
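
A minimal NumPy sketch of the three examples:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))   # squashes input into (0, 1)

def tanh(x):
    return np.tanh(x)                 # squashes input into (-1, 1)

def relu(x):
    return np.maximum(0.0, x)         # zero for negatives, identity otherwise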

Momentum

Enhances gradient descent by taking past gradients into account when updating the weights. Typically ranges from 0 to 1.
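
A sketch of the classical momentum update; mu = 0.9 is an assumed (if common) default:

def momentum_step(w, grad, velocity, lr=0.01, mu=0.9):
    # Accumulate an exponentially decaying sum of past gradients,
    # then step along that smoothed direction.
    velocity = mu * velocity + grad
    return w - lr * velocity, velocity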

Max Depth of Tree

In decision-tree-based models, limits how deep the tree may grow. A form of regularization to prevent overfitting; the value is usually tuned against model performance on validation data.
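
In scikit-learn, for example, the cap is the max_depth argument (the depth values here are arbitrary):

from sklearn.tree import DecisionTreeClassifier

shallow = DecisionTreeClassifier(max_depth=3)   # stronger regularization
deep = DecisionTreeClassifier(max_depth=None)   # grow until leaves are pure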

Number of Layers

Dictates the depth of a neural network; can range from one layer to many, depending on the complexity of the function being modeled.

Batch Size

The number of samples propagated through the network in a single pass. Common values are powers of two such as 32, 64, and 128.
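
A minimal sketch of mini-batch iteration; batch_size=32 is an assumed default:

import numpy as np

def iterate_minibatches(X, y, batch_size=32):
    # Yield shuffled batches of `batch_size` samples (the last may be smaller).
    idx = np.random.permutation(len(X))
    for start in range(0, len(X), batch_size):
        batch = idx[start:start + batch_size]
        yield X[batch], y[batch]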

Number of Neurons per Layer

Dictates the width of a neural network layer; can range from a few neurons to thousands. Influences the representational capacity of the network.
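
A sketch tying this card to Number of Layers above: the length of hidden_sizes sets the depth, each entry sets a layer's width (all sizes below are arbitrary examples):

import numpy as np

hidden_sizes = [64, 32, 16]              # 3 hidden layers of 64, 32, 16 neurons
layer_dims = [10] + hidden_sizes + [1]   # assumed input dim 10, output dim 1

# One weight matrix per consecutive pair of layers.
weights = [np.random.randn(m, n) * 0.01
           for m, n in zip(layer_dims[:-1], layer_dims[1:])]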

Weight Initialization

Sets the initial values of the weights. Common strategies include zero initialization, random initialization, and Xavier/Glorot initialization.
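
A NumPy sketch of the three strategies for a single layer (the layer dimensions are assumed):

import numpy as np

fan_in, fan_out = 256, 128

w_zero = np.zeros((fan_in, fan_out))                # zero initialization
w_random = 0.01 * np.random.randn(fan_in, fan_out)  # small random values

# Xavier/Glorot uniform: keeps activation variance roughly stable across layers.
limit = np.sqrt(6.0 / (fan_in + fan_out))
w_glorot = np.random.uniform(-limit, limit, size=(fan_in, fan_out))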

Dropout Rate

A regularization technique where randomly selected neurons are ignored during training. Typically ranges from 0 to 1, with 0.5 being a common value.
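
A sketch of "inverted" dropout, one common way to implement the idea (rate=0.5 is the common value from the card):

import numpy as np

def dropout(activations, rate=0.5, training=True):
    # Zero out roughly `rate` of the neurons and rescale the survivors,
    # so nothing needs to change at inference time.
    if not training or rate == 0.0:
        return activations
    mask = np.random.rand(*activations.shape) >= rate
    return activations * mask / (1.0 - rate)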

Ensemble Size

In ensemble methods, the number of individual models to train and combine. Common values depend on model complexity and available computational resources.
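
In scikit-learn's random forest, for example, the ensemble size is the n_estimators argument (the counts below are arbitrary):

from sklearn.ensemble import RandomForestClassifier

small_forest = RandomForestClassifier(n_estimators=10)
large_forest = RandomForestClassifier(n_estimators=500)  # costlier, often stabler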

Learning Rate Decay

Gradually reduces the learning rate during training to let the algorithm settle into a minimum. Often implemented as an exponential decay.
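
A sketch of one exponential-decay schedule (the decay rate is an assumed example):

def exponential_decay(initial_lr, epoch, decay_rate=0.96):
    # lr_t = lr_0 * decay_rate ** t: the rate shrinks geometrically each epoch.
    return initial_lr * decay_rate ** epoch

# exponential_decay(0.1, 0) -> 0.1; exponential_decay(0.1, 50) -> ~0.013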

Early Stopping

A form of regularization where you stop training as soon as the performance on a validation set begins to degrade.
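
A sketch of a patience-based stopping loop; train_step and validate are hypothetical callables standing in for a real training framework:

def fit(train_step, validate, max_epochs=1000, patience=10):
    # Stop once validation loss has not improved for `patience` epochs.
    best, stale = float("inf"), 0
    for epoch in range(max_epochs):
        train_step()
        val_loss = validate()
        if val_loss < best:
            best, stale = val_loss, 0
        else:
            stale += 1
            if stale >= patience:
                break  # validation performance has stopped improving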

Gradient Clipping Threshold

Prevents gradients from becoming too large, which can cause numerical instability and hinder convergence. The threshold itself is a hyperparameter to tune.
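
A NumPy sketch of clipping by L2 norm (threshold=1.0 is an assumed example):

import numpy as np

def clip_by_norm(grad, threshold=1.0):
    # Rescale the gradient so its L2 norm never exceeds the threshold.
    norm = np.linalg.norm(grad)
    if norm > threshold:
        grad = grad * (threshold / norm)
    return grad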

Regularization Parameter

Controls the degree of regularization applied to prevent overfitting. Common types are L1 (lasso) and L2 (ridge). Values are usually small positive numbers, often between 0 and 1, and are tuned on validation data.
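
A sketch of how the parameter (here named lam, an assumed name) weights the penalty added to the data loss:

import numpy as np

def penalized_loss(data_loss, w, lam=0.01, kind="l2"):
    # lam = 0 disables regularization; larger lam shrinks the weights harder.
    if kind == "l1":                              # lasso: absolute weights
        return data_loss + lam * np.sum(np.abs(w))
    return data_loss + lam * np.sum(w ** 2)       # ridge: squared weights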
