Activation Functions in Neural Networks
Tanh
The graph is a rescaled S-shaped curve that maps any real-valued number to the (-1, 1) range. Often used in hidden layers because its zero-centered outputs keep the mean of the activations passed to the next layer closer to zero, which often leads to faster convergence.
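For reference, the common definition is $\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$, with outputs bounded in $(-1, 1)$ and $\tanh(0) = 0$.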
Softplus
The graph is a smooth approximation of the ReLU function and is continuously differentiable everywhere. The softplus activation is useful when a smooth, differentiable activation function is required.
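A compact form of the usual definition: $\mathrm{softplus}(x) = \ln(1 + e^{x})$; its derivative is the sigmoid $\sigma(x) = \frac{1}{1 + e^{-x}}$, which is what makes it smooth everywhere.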
ELU (Exponential Linear Unit)
The graph is similar to ReLU, but on the negative side it curves smoothly toward a negative saturation value via an exponential function instead of being zero. This helps to mitigate the vanishing gradient problem for negative inputs while keeping some of the benefits of ReLU.
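One common parameterization, where $\alpha > 0$ (often set to 1) controls the negative saturation value: $\mathrm{ELU}(x) = x$ for $x > 0$ and $\mathrm{ELU}(x) = \alpha(e^{x} - 1)$ for $x \le 0$.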
Leaky ReLU
The graph is similar to ReLU, but instead of outputting zero for negative inputs, it applies a small, non-zero slope so that the gradient never fully 'dies'. This variant can prevent the problem of 'dead neurons' in a ReLU network.
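A typical formulation, with a small fixed slope $\alpha$ (often around 0.01): $\mathrm{LeakyReLU}(x) = x$ for $x > 0$ and $\alpha x$ for $x \le 0$.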
Swish
The graph is a smooth, non-monotonic function that combines elements from both sigmoid and ReLU. Swish can sometimes outperform common activations like ReLU, particularly in deep models.
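The usual form, with $\sigma$ the sigmoid and $\beta$ either fixed at 1 or learned: $\mathrm{swish}(x) = x \cdot \sigma(\beta x)$.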
Sigmoid
The graph is an S-shaped curve that maps any real-valued number to the (0, 1) range. Typically used in binary classification problems as the final activation function.
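For reference: $\sigma(x) = \frac{1}{1 + e^{-x}}$, which approaches 0 for large negative inputs and 1 for large positive inputs.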
Identity
The graph is a straight line passing through the origin with a slope of 1. This activation function is used when the output of a neuron should equal its input, typically in output layers for regression problems where the prediction can take any real value.
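The definition is simply $f(x) = x$, so its derivative is constant at 1.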
ReLU (Rectified Linear Unit)
The graph is a piecewise linear function that outputs the input directly if it is positive and outputs zero otherwise. It is currently one of the most popular activation functions for deep neural networks, due to its computational efficiency and the sparsity it induces in activations.
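The standard definition: $\mathrm{ReLU}(x) = \max(0, x)$, so the gradient is 1 for positive inputs and 0 for negative inputs.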
Parametric ReLU (PReLU)
This is a type of Leaky ReLU where the slope for negative inputs is learned during training. This variant adds a small amount of complexity to the model but enables it to adapt the activation function to the specific task.
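A common formulation, where $a$ is a learnable coefficient (sometimes one per channel): $\mathrm{PReLU}(x) = x$ for $x > 0$ and $a x$ for $x \le 0$.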
Softmax
Rather than a single curve, this is a vector-valued function whose outputs can be interpreted as the probabilities of each class in a classification task. It is often used in the output layer of a network for multi-class classification problems.
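The standard definition for a score vector $z$: $\mathrm{softmax}(z)_i = \frac{e^{z_i}}{\sum_j e^{z_j}}$, so the outputs are non-negative and sum to 1.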