Machine Learning Basics

Flashcards

Supervised Learning

Supervised Learning is a type of machine learning where the model is trained on labeled data. For example, in image classification, the training set consists of images tagged with their respective categories.
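As a minimal sketch of learning from labeled data, the snippet below fits a nearest-centroid classifier. The feature values, the labels, and the choice of classifier are illustrative assumptions, not part of the original card.

```python
from math import dist

# Hypothetical labeled training set: (features, label) pairs.
training_data = [
    ((1.0, 1.2), "cat"),
    ((0.8, 1.0), "cat"),
    ((3.0, 3.1), "dog"),
    ((3.2, 2.9), "dog"),
]

def fit_centroids(examples):
    """Average the feature vectors that share each label."""
    grouped = {}
    for features, label in examples:
        grouped.setdefault(label, []).append(features)
    return {
        label: tuple(sum(column) / len(rows) for column in zip(*rows))
        for label, rows in grouped.items()
    }

def predict(centroids, features):
    """Return the label whose centroid is closest to the input."""
    return min(centroids, key=lambda label: dist(centroids[label], features))

centroids = fit_centroids(training_data)
prediction = predict(centroids, (2.9, 3.0))
print(prediction)
```

The labels are what make this supervised: the model only knows where "cat" and "dog" sit in feature space because the training examples say so.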

Unsupervised Learning

Unsupervised Learning involves training a model on data without labels. The model tries to infer patterns within the data. Clustering is a common example of unsupervised learning.

Overfitting

Overfitting occurs when a model learns the training data too well, including noise and outliers, which harms its performance on new, unseen data. An example of overfitting can occur in a decision tree that grows too deep and captures noise in the data.
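The effect can be sketched with a toy comparison, assuming a hypothetical 1-D dataset in which two training labels are deliberately flipped to act as noise. A 1-nearest-neighbor model stands in for the overly flexible learner, and a fixed threshold stands in for a simpler one; both are illustrative choices.

```python
# Hypothetical 1-D data. The true rule is label = 1 when x >= 5,
# but two training labels (x=3 and x=8) are flipped to simulate noise.
train = [(1, 0), (2, 0), (3, 1), (4, 0), (6, 1), (7, 1), (8, 0), (9, 1)]
test = [(1.5, 0), (2.6, 0), (3.4, 0), (6.5, 1), (7.6, 1), (8.4, 1)]

def memorizer(x):
    """1-nearest-neighbor: copies the label of the closest training point."""
    _, nearest_label = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest_label

def simple_rule(x):
    """A single threshold, too simple to memorize the noisy points."""
    return 1 if x >= 5 else 0

def accuracy(model, data):
    """Fraction of examples the model labels correctly."""
    return sum(model(x) == label for x, label in data) / len(data)

memorizer_train = accuracy(memorizer, train)
memorizer_test = accuracy(memorizer, test)
rule_train = accuracy(simple_rule, train)
rule_test = accuracy(simple_rule, test)
print(memorizer_train, memorizer_test, rule_train, rule_test)
```

The memorizer is perfect on the training set because it reproduces the noisy labels, and that is exactly what hurts it on the unseen points near x=3 and x=8.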

Underfitting

Underfitting happens when a model is too simple to capture the underlying structure of the data, leading to poor performance on both training and unseen data. A linear model trying to fit non-linear data is an example of underfitting.
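A small sketch of the linear-model example, assuming made-up data that follows y = x². The least-squares line is computed in closed form, and its error on the training data itself stays large.

```python
xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]  # non-linear target (assumed for illustration)

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x

slope, intercept = fit_line(xs, ys)
train_mse = sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
print(slope, intercept, train_mse)
```

The best line here is flat, so it misses the curve entirely. A high error on the training set, not only on new data, is the signature of underfitting.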

Reinforcement Learning

Reinforcement Learning is a type of machine learning where an agent learns to make decisions by performing actions and receiving rewards in a given environment. An example is a chess-playing algorithm that improves by playing more games.
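Chess is too large for a short example, so the sketch below uses tabular Q-learning on a hypothetical five-cell corridor instead. The environment, reward, learning rate, and discount factor are all assumptions chosen for illustration.

```python
import random

N_STATES = 5                 # cells 0..4; cell 4 is the goal (terminal)
ACTIONS = ("left", "right")
ALPHA, GAMMA = 0.5, 0.9      # learning rate and discount factor (assumed values)

def step(state, action):
    """Deterministic environment: move one cell, reward 1 on reaching the goal."""
    next_state = max(state - 1, 0) if action == "left" else state + 1
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward

rng = random.Random(0)
q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

for _ in range(500):                      # episodes
    state = 0
    while state != N_STATES - 1:
        action = rng.choice(ACTIONS)      # purely random exploration
        next_state, reward = step(state, action)
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        # Q-learning update: move the estimate toward reward + discounted future value.
        q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
        state = next_state

# Greedy policy read off the learned action values.
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)]
print(policy)
```

The agent is never told that "right" is correct; it infers this from the rewards it collects while acting.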

Neural Networks

Neural Networks are computing systems inspired by the biological neural networks of animal brains. They are used in deep learning and consist of interconnected units called neurons. An example is Convolutional Neural Networks used in image recognition.
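To make "interconnected units" concrete, here is a sketch of a forward pass through a tiny two-layer network. The weights are hand-picked (not trained) so that the network computes XOR; the layer sizes and the ReLU activation are assumptions for illustration.

```python
def relu(z):
    return max(0.0, z)

def dense(inputs, weights, biases, activation):
    """One fully connected layer: each neuron applies activation(w . x + b)."""
    return [
        activation(sum(w * x for w, x in zip(neuron_weights, inputs)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]

# Hand-picked (not trained) parameters that make the network compute XOR.
HIDDEN_W, HIDDEN_B = [[1.0, 1.0], [1.0, 1.0]], [0.0, -1.0]
OUTPUT_W, OUTPUT_B = [[1.0, -2.0]], [0.0]

def forward(x1, x2):
    hidden = dense([x1, x2], HIDDEN_W, HIDDEN_B, relu)
    return dense(hidden, OUTPUT_W, OUTPUT_B, lambda z: z)[0]

outputs = {(a, b): forward(a, b) for a in (0, 1) for b in (0, 1)}
print(outputs)
```

XOR cannot be computed by a single linear unit, which is why the hidden layer with a non-linear activation is needed.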

Decision Trees

Decision Trees are predictive models that map observations about an item to a conclusion about its target value through a sequence of feature-based tests. They are used for both classification and regression tasks. An example is using a decision tree to predict whether a loan applicant is a high or low credit risk.
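The loan example can be sketched with the smallest possible tree, a one-split "stump" chosen by Gini impurity. The applicant data and the use of income as the only feature are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    return 1.0 - sum((labels.count(c) / len(labels)) ** 2 for c in set(labels))

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def fit_stump(values, labels):
    """Pick the threshold with the lowest weighted Gini impurity.

    Returns None if all feature values are identical (no split possible).
    """
    best = None
    ordered = sorted(set(values))
    for low, high in zip(ordered, ordered[1:]):
        threshold = (low + high) / 2
        left = [l for v, l in zip(values, labels) if v <= threshold]
        right = [l for v, l in zip(values, labels) if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if best is None or score < best["score"]:
            best = {"score": score, "threshold": threshold,
                    "left": majority(left), "right": majority(right)}
    return best

def predict(stump, value):
    return stump["left"] if value <= stump["threshold"] else stump["right"]

# Hypothetical applicants: annual income in thousands, and observed credit risk.
incomes = [20, 25, 30, 60, 75, 90]
risks = ["high", "high", "high", "low", "low", "low"]

stump = fit_stump(incomes, risks)
print(stump["threshold"], predict(stump, 28), predict(stump, 80))
```

A full decision tree repeats this split search recursively on each side until a stopping rule is met.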

Random Forests

Random Forests consist of many individual decision trees that operate as an ensemble. Each tree makes a prediction, and the output is the mode of those predictions for classification or their mean for regression. An example is using a random forest to improve the accuracy of a classification task over a single decision tree.
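The sketch below covers only the aggregation step, assuming the individual trees have already produced their predictions. The function name and the sample predictions are made up for illustration.

```python
from collections import Counter
from statistics import mean

def aggregate(tree_predictions, task):
    """Combine per-tree predictions: mode for classification, mean for regression."""
    if task == "classification":
        return Counter(tree_predictions).most_common(1)[0][0]
    if task == "regression":
        return mean(tree_predictions)
    raise ValueError(f"unknown task: {task}")

# Hypothetical outputs from five classification trees and four regression trees.
votes = ["low", "high", "low", "low", "high"]
estimates = [10.0, 12.0, 11.0, 13.0]

class_output = aggregate(votes, "classification")
regression_output = aggregate(estimates, "regression")
print(class_output, regression_output)
```

In a real forest each tree is also trained on a bootstrap sample with a random subset of features, which is what keeps the trees' errors from being identical.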

Gradient Descent

Gradient Descent is an optimization algorithm that minimizes a function by iteratively stepping in the direction of the negative gradient, which is the direction of steepest descent. It is used extensively in training neural networks. $J(\theta) = \frac{1}{2m}\sum_{i=1}^m(h_\theta(x^{(i)}) - y^{(i)})^2$ is an example of a cost function being minimized.
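A minimal sketch of minimizing the cost function $J(\theta)$ above, assuming a linear hypothesis $h_\theta(x) = \theta_0 + \theta_1 x$. The data, learning rate, and step count are illustrative choices.

```python
# Hypothetical data generated from y = 2x + 1, so the minimum is at theta = (1, 2).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]

def cost(theta0, theta1):
    """J(theta) = 1/(2m) * sum of squared prediction errors."""
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)

def gradient_descent(learning_rate=0.1, steps=2000):
    theta0, theta1 = 0.0, 0.0
    m = len(xs)
    for _ in range(steps):
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m                               # dJ/dtheta0
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m    # dJ/dtheta1
        # Step against the gradient, updating both parameters together.
        theta0 -= learning_rate * grad0
        theta1 -= learning_rate * grad1
    return theta0, theta1

theta0, theta1 = gradient_descent()
print(theta0, theta1, cost(theta0, theta1))
```

A learning rate that is too large makes the updates overshoot and diverge, while one that is too small makes convergence slow.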

Support Vector Machines (SVM)

Support Vector Machines are supervised learning models used for classification and regression tasks. For classification, they try to find the hyperplane that separates the classes with the largest margin. An example is using SVM to classify hand-written letters based on pixel data.
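The sketch below does not train an SVM. It assumes a hand-picked hyperplane (weights `w` and bias `b`) for a hypothetical toy dataset and shows how the decision rule, the margin width, and the support vectors follow from it.

```python
from math import hypot

# Hypothetical linearly separable data with labels +1 / -1.
points = [((2.0, 2.0), 1), ((3.0, 3.0), 1), ((1.0, 1.0), -1), ((0.0, 0.0), -1)]

# Hand-picked hyperplane w . x + b = 0 (not learned by an optimizer).
w, b = (1.0, 1.0), -3.0

def decision_value(x):
    """Signed score: positive on one side of the hyperplane, negative on the other."""
    return w[0] * x[0] + w[1] * x[1] + b

def classify(x):
    return 1 if decision_value(x) >= 0 else -1

# With the scaling y * (w . x + b) >= 1 for all points, the margin width is 2 / ||w||.
margin_width = 2 / hypot(*w)

# Support vectors are the points that lie exactly on the margin boundary.
support_vectors = [x for x, y in points if abs(y * decision_value(x) - 1.0) < 1e-9]
print(margin_width, support_vectors)
```

Only the support vectors determine where the boundary sits; moving the other points slightly would not change it.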

K-Means Clustering

K-Means Clustering is an unsupervised learning algorithm that divides a dataset into K distinct, non-overlapping clusters. It alternates between assigning each data point to the nearest cluster center and moving each center to the mean of its assigned points. An example is segmenting customers into different groups based on purchasing behavior.
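A short sketch of the algorithm, which repeats two steps: assign each point to its nearest center, then move each center to the mean of its points. The customer data and the starting centers are hypothetical.

```python
from math import dist

def k_means(points, centers, iterations=10):
    """Alternate between assigning points and recomputing centers."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: dist(p, centers[i]))
            clusters[nearest].append(p)
        # Move each center to the mean of its cluster; keep it if the cluster is empty.
        centers = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else center
            for cluster, center in zip(clusters, centers)
        ]
    return centers, clusters

# Hypothetical customers: (visits per month, average spend).
customers = [(1, 10), (2, 12), (1, 11), (9, 80), (10, 85), (8, 78)]
centers, clusters = k_means(customers, centers=[(1, 10), (9, 80)])
print(centers)
```

The result depends on the starting centers, so practical implementations usually run several random initializations and keep the best one.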

Cross-Validation

Cross-Validation is a technique for assessing how a predictive model will perform on an independent data set. It involves dividing the dataset into a number of parts or 'folds', training the model on all but one fold, testing on the held-out fold, and repeating so that each fold serves as the test set once. An example is k-fold cross-validation.
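The fold bookkeeping for k-fold cross-validation can be sketched as follows. The helper name is made up, and the sketch skips shuffling, which is usually applied first in practice.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) so each fold is the test set once."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

folds = list(k_fold_indices(6, 3))
for train, test in folds:
    print("train:", train, "test:", test)
```

The model is trained and scored once per fold, and the k scores are typically averaged into a single performance estimate.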

Bias-Variance Tradeoff

The Bias-Variance Tradeoff is the tension between error from overly simple assumptions (bias) and error from sensitivity to the particular training set (variance). High bias can lead to underfitting, and high variance can lead to overfitting. An example is adjusting model complexity to balance the tradeoff.

Regularization

Regularization refers to techniques that reduce overfitting by penalizing large model parameters, thereby encouraging simpler models. Examples include L1 (Lasso) and L2 (Ridge) regularization.
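As a sketch of how an L2 penalty shrinks a parameter, consider a one-feature model with no intercept, y = w * x. Minimizing sum((y - w*x)^2) + penalty * w^2 has the closed-form solution used below. The data and penalty value are illustrative.

```python
def fit_slope(xs, ys, l2_penalty=0.0):
    """Closed-form ridge solution for y = w * x: sum(x*y) / (sum(x^2) + penalty)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + l2_penalty)

# Hypothetical data lying exactly on y = 2x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

unregularized = fit_slope(xs, ys)                   # 28 / 14
regularized = fit_slope(xs, ys, l2_penalty=14.0)    # 28 / (14 + 14)
print(unregularized, regularized)
```

The penalty pulls the weight toward zero. L1 (Lasso) differs in that it penalizes the absolute value of each weight and can drive some weights exactly to zero.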

Principal Component Analysis (PCA)

Principal Component Analysis is a statistical technique that uses orthogonal transformations to convert a set of correlated variables into a set of uncorrelated variables called principal components. This is useful for dimensionality reduction. An example is reducing the dimensions of a dataset with many variables before applying a machine learning algorithm.
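For two variables, the first principal component can be computed in closed form from the covariance matrix. The sketch below assumes made-up 2-D data and reduces it to one dimension; higher-dimensional data would normally use a linear algebra library.

```python
from math import atan2, cos, sin

def first_principal_component(points):
    """Return the unit direction of greatest variance and the 1-D projections."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]
    # Entries of the 2x2 sample covariance matrix.
    var_x = sum(x * x for x, _ in centered) / (n - 1)
    var_y = sum(y * y for _, y in centered) / (n - 1)
    cov_xy = sum(x * y for x, y in centered) / (n - 1)
    # Angle of the leading eigenvector of a symmetric 2x2 matrix.
    angle = 0.5 * atan2(2 * cov_xy, var_x - var_y)
    direction = (cos(angle), sin(angle))
    scores = [x * direction[0] + y * direction[1] for x, y in centered]
    return direction, scores

# Hypothetical, perfectly correlated variables.
points = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0)]
direction, scores = first_principal_component(points)
print(direction, scores)
```

Because the two variables move together, a single score per point along the diagonal direction preserves all of the variation in this data.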
