Fundamentals of Machine Learning
42 Flashcards
Unsupervised Learning
Machine learning using datasets without predefined labels to find structure in the data.
Learning Rate
A hyperparameter that controls how much the weights in the network are adjusted with respect to the loss gradient.
Support Vector Machine (SVM)
A supervised learning model that classifies data by finding the hyperplane that separates the classes with the largest margin, that is, the greatest distance to the nearest data points of each class.
Batch Normalization
A technique for training very deep neural networks that standardizes the inputs to a layer for each mini-batch.
Overfitting
Occurs when a model learns the training data too well, including the noise, and performs poorly on new data.
Neural Networks
Models built from layers of interconnected nodes (neurons) with learned weights that recognize underlying relationships in data, loosely inspired by the way the human brain operates.
Backpropagation
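The algorithm that computes the gradient of the loss with respect to every weight in a neural network by applying the chain rule backward from the output layer. An optimizer then uses these gradients to update the weights and reduce the output error.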
Hyperparameter Tuning
The process of finding the best hyperparameters (settings chosen before training, such as learning rate or tree depth) for a machine learning model to improve its performance.
Supervised Learning
A type of machine learning where models are trained on labeled datasets.
Train/Test Split
A method of dividing the dataset into two parts so that one can be used for training the model, and the other for testing its performance.
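Example: a minimal sketch of a shuffled split in plain Python. The function name `train_test_split`, the 25% test fraction, and the fixed seed are illustrative assumptions, not part of the card.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

train, test = train_test_split(range(20), test_fraction=0.25)
print(len(train), len(test))  # 15 5
```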
Underfitting
Occurs when a model is too simple to capture the underlying trend of the data, so it performs poorly even on the training data.
Confusion Matrix
A table used to describe the performance of a classification model on a set of test data for which the true values are known.
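Example: a small sketch that tallies the four cells of a binary confusion matrix. The helper name `confusion_counts` and the sample labels are made up for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            counts["tp" if actual == positive else "fp"] += 1
        else:
            counts["fn" if actual == positive else "tn"] += 1
    return counts

y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
cells = confusion_counts(y_true, y_pred)
print(cells)  # {'tp': 3, 'fp': 1, 'fn': 1, 'tn': 3}
```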
One-hot Encoding
A process that converts a categorical variable into binary vectors with one position per category, where a 1 marks the category present and all other positions are 0, so that algorithms requiring numeric input can use it.
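Example: a minimal sketch of one-hot encoding in plain Python. The function name `one_hot_encode` and the alphabetical ordering of categories are assumptions chosen for illustration.

```python
def one_hot_encode(values):
    """Return (categories, rows) where each row has a single 1."""
    categories = sorted(set(values))
    position = {category: i for i, category in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[position[value]] = 1  # mark the category that is present
        rows.append(row)
    return categories, rows

categories, rows = one_hot_encode(["red", "green", "blue", "green"])
print(categories)  # ['blue', 'green', 'red']
print(rows)        # [[0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 0]]
```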
Reinforcement Learning
A training methodology where an agent learns to make decisions by performing actions in an environment to maximize a reward.
Convolutional Neural Networks (CNN)
A class of deep neural networks commonly applied to analyzing visual imagery, using a series of convolutional layers.
Generative Adversarial Networks (GANs)
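A class of machine learning frameworks in which a generator network and a discriminator network are trained against each other, so that the generator learns to produce new data instances that resemble the training data.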
Word Embedding
A representation of text in which each word is mapped to a dense numeric vector, so that words with similar meanings have similar vectors.
Decision Tree
A flowchart-like tree structure where each internal node represents a test on a feature (or attribute), each branch represents a decision rule, and each leaf node represents an outcome.
Transfer Learning
A machine learning method where a model developed for a task is reused as the starting point for a model on a second task.
Random Forest
An ensemble learning method that constructs multiple decision trees during training and outputs the mode of the trees' predicted classes for classification or the mean of their predictions for regression.
K-Means Clustering
An unsupervised learning algorithm that partitions an unlabeled dataset into k groups (clusters) based on feature similarity, assigning each point to the cluster with the nearest centroid.
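Example: a rough sketch of the usual assign-then-update loop (Lloyd's algorithm) for k-means. Using the first k points as initial centroids and a fixed iteration count are simplifying assumptions; real implementations typically use random or k-means++ initialization and a convergence check.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(points, k, iterations=10):
    """Cluster points into k groups; returns (centroids, labels)."""
    centroids = [tuple(p) for p in points[:k]]  # simplistic initialization
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # an empty cluster keeps its old centroid
                centroids[c] = tuple(
                    sum(coords) / len(members) for coords in zip(*members)
                )
    return centroids, labels

points = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9), (8, 9.5)]
centroids, labels = k_means(points, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```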
Precision
The ratio of correctly predicted positive observations to the total predicted positive observations.
Recall
The ratio of correctly predicted positive observations to all observations that are actually positive.
F1 Score
The harmonic mean of precision and recall. It balances the two and is especially useful when the class distribution is uneven.
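Example: precision, recall, and F1 computed from raw counts. The function name `precision_recall_f1` and the counts used are invented for illustration, and returning 0.0 for an empty denominator is one common convention, not the only one.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute (precision, recall, f1) from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# 8 true positives, 2 false positives, 8 false negatives
precision, recall, f1 = precision_recall_f1(tp=8, fp=2, fn=8)
print(precision, recall, round(f1, 4))  # 0.8 0.5 0.6154
```

The harmonic mean sits closer to the lower of the two values, so a model cannot earn a high F1 by excelling at only one of them.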
Bias-Variance Tradeoff
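The property of machine learning models where reducing bias (by making the model more flexible) tends to increase variance, and reducing variance (by making the model simpler) tends to increase bias, so total error is minimized at a balance between the two.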
Principal Component Analysis (PCA)
A statistical technique to simplify a dataset by reducing its number of dimensions while preserving as much variance as possible.
Boosting
An ensemble technique where weak learners are trained sequentially, each one focusing on the examples the previous learners misclassified, and are then combined into a single strong learner.
Ensemble Learning
The use of multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone.
Activation Function
A function applied to the weighted sum of a neuron's inputs that determines how strongly the neuron is activated. It introduces the non-linearity that lets a network model complex relationships.
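Example: two common activation functions, sigmoid and ReLU, sketched in plain Python. These are standard choices picked for illustration; the card does not name specific functions.

```python
import math

def sigmoid(x):
    """Squash x into (0, 1); branches avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def relu(x):
    """Pass positive inputs through unchanged and zero out the rest."""
    return max(0.0, x)

print(sigmoid(0.0))           # 0.5
print(relu(-2.0), relu(3.0))  # 0.0 3.0
```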
Cross-Validation
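A statistical method used to estimate the skill of a machine learning model on new data by partitioning the dataset into subsets (folds), training on some folds, validating on the held-out fold, and rotating so that each fold is held out once.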
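Example: a sketch of how k-fold cross-validation might partition sample indices, with each fold held out once. The generator name `k_fold_indices` is hypothetical, and shuffling is omitted for brevity.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size

folds = list(k_fold_indices(10, 3))
print([len(validation) for _, validation in folds])  # [4, 3, 3]
```

A model would be trained and scored once per fold, and the scores averaged to estimate performance on new data.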
Feature Scaling
A method used to standardize the range of independent variables or features of data.
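Example: two common scaling schemes, min-max scaling and z-score standardization, sketched with the standard library. The function names are illustrative, and mapping a constant feature to zeros is an assumed convention.

```python
import statistics

def min_max_scale(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant feature: nothing to scale
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]

scaled = min_max_scale([10, 20, 30])
z_scores = standardize([10, 20, 30])
print(scaled)  # [0.0, 0.5, 1.0]
```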
Bagging
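An ensemble technique (bootstrap aggregation) where multiple versions of a predictor are trained on different bootstrap samples of the training data, drawn with replacement, and their results are averaged for regression or combined by majority vote for classification.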
Data Preprocessing
A process applied to raw data to prepare it for further analysis and modeling, typically including cleaning, normalization, transformation, and feature selection.
Loss Function
A function that measures the difference between the model's prediction and the actual data. Training optimizes the model by minimizing it.
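Example: two widely used loss functions, mean squared error for regression and binary cross-entropy for classification. They are chosen here for illustration; the clipping constant `eps` is an assumption to avoid taking log(0).

```python
import math

def mean_squared_error(y_true, y_pred):
    """Average of squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Average negative log-likelihood of 0/1 labels given probabilities."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip so log() stays finite
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(y_true)

mse = mean_squared_error([3.0, 5.0], [2.0, 7.0])
bce = binary_cross_entropy([1, 0], [0.5, 0.5])
print(mse)  # 2.5
```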
Gradient Descent
An optimization algorithm that iteratively moves towards the minimum of a function by taking steps proportional to the negative of the gradient.
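Example: a minimal sketch of gradient descent on the one-dimensional function f(x) = (x - 3)^2, whose gradient is 2(x - 3). The function, starting point, learning rate, and step count are all illustrative assumptions.

```python
def gradient_descent(gradient, start, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, scaled by the learning rate."""
    x = start
    for _ in range(steps):
        x -= learning_rate * gradient(x)
    return x

# Minimize f(x) = (x - 3)^2; the true minimum is at x = 3.
minimum = gradient_descent(lambda x: 2 * (x - 3), start=0.0)
print(round(minimum, 6))  # 3.0
```

With this function, a learning rate above 1.0 makes each step overshoot by more than it corrects, so the iterates diverge instead of converging.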
Natural Language Processing (NLP)
A field of AI that gives machines the ability to read, understand and derive meaning from human languages.
Model Evaluation
The process of determining the performance of a machine learning model on a specific set of data, often data the model has not seen before, using metrics such as accuracy, precision, and recall.
Dropout
A regularization technique for reducing overfitting in neural networks by randomly setting a fraction of units to zero at each training step, which prevents complex co-adaptations on the training data.
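Example: a sketch of "inverted" dropout applied to a list of activations. Scaling the kept units by 1 / (1 - drop_prob) is a common variant assumed here so that no rescaling is needed at inference time; the function name and parameters are illustrative.

```python
import random

def dropout(activations, drop_prob=0.5, training=True, rng=None):
    """Randomly zero activations during training and rescale the rest."""
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    if not training or drop_prob == 0.0:
        return list(activations)  # inference: pass values through unchanged
    rng = rng or random.Random()
    keep_prob = 1.0 - drop_prob
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]

dropped = dropout([1.0] * 1000, drop_prob=0.5, rng=random.Random(0))
print(sum(1 for v in dropped if v == 0.0))  # roughly half are zeroed
```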
Regularization
The process of adding information in order to prevent overfitting, typically by adding a penalty on the magnitude of parameters of the model.
Dimensionality Reduction
The process of reducing the number of random variables under consideration by obtaining a set of principal variables.
Anomaly Detection
The identification of rare items, events, or observations which raise suspicions by differing significantly from the majority of the data.
Imbalanced Classes
Occurs when a class within a dataset significantly outnumbers other classes, leading to challenges in training effective models.
© Hypatia.Tech. 2024 All rights reserved.