Fundamentals of Machine Learning
42 flashcards
F1 Score
The harmonic mean of precision and recall. It balances the two in a single number and is more informative than accuracy when the class distribution is uneven.
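As a quick illustration of the harmonic mean, the sketch below computes precision, recall, and F1 from raw counts. The function name `f1_from_counts` and the example counts are assumptions made for illustration only.

```python
def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """Return the F1 score from true positives, false positives, and false negatives."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: precision = 8/10 = 0.8, recall = 8/16 = 0.5.
score = f1_from_counts(tp=8, fp=2, fn=8)
print(round(score, 3))  # 0.615
```

Here F1 (about 0.615) is lower than the arithmetic mean of precision and recall (0.65), because the harmonic mean is pulled toward the smaller of the two values.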
Ensemble Learning
The use of multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone.
Cross-Validation
A statistical method used to estimate the skill of a machine learning model on new data by partitioning the dataset into subsets.
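One common way to partition the dataset is k-fold cross-validation, where each subset serves once as the test set. The following is a minimal sketch of the index bookkeeping only; the helper name `k_fold_indices` is hypothetical, and it uses contiguous folds without shuffling, which real workflows often add.

```python
def k_fold_indices(n_samples: int, k: int):
    """Yield (train_indices, test_indices) pairs for k contiguous folds."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


for train, test in k_fold_indices(n_samples=6, k=3):
    print("train:", train, "test:", test)
```

Every sample appears in exactly one test fold, so averaging the per-fold scores uses all of the data for evaluation without ever testing on a sample the model was trained on in that fold.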
Loss Function
A function that quantifies the difference between the model's predictions and the actual values; training optimizes the model by minimizing it.
Convolutional Neural Networks (CNN)
A class of deep neural networks commonly applied to analyzing visual imagery, using a series of convolutional layers.
Imbalanced Classes
Occurs when a class within a dataset significantly outnumbers other classes, leading to challenges in training effective models.
Underfitting
Occurs when a model is too simple to capture the underlying trend of the data, so it performs poorly even on the training data.
Recall
The ratio of correctly predicted positive observations to all observations that are actually positive.
Hyperparameter Tuning
The process of searching for the hyperparameter values (settings chosen before training, such as the learning rate or tree depth) that give a machine learning model its best performance.
K-Means Clustering
An unsupervised learning algorithm that partitions an unlabeled dataset into k groups (clusters) based on feature similarity, by repeatedly assigning each point to its nearest cluster centroid and recomputing the centroids.
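The two alternating steps of k-means (assign each point to its nearest centroid, then move each centroid to the mean of its points) can be sketched on 1-D data as follows. The function name `k_means_1d`, the fixed iteration count, and the hand-picked starting centroids are assumptions for illustration; practical implementations choose initial centroids more carefully and stop when assignments no longer change.

```python
def k_means_1d(points, centroids, n_iters=10):
    """Run a fixed number of k-means iterations on 1-D data."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(n_iters):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centroids, clusters = k_means_1d(points, centroids=[0.0, 5.0])
print(centroids)  # [2.0, 11.0]
print(clusters)   # [[1.0, 2.0, 3.0], [10.0, 11.0, 12.0]]
```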
Activation Function
A function applied to the weighted sum of a neuron's inputs in a neural network; it determines the neuron's output (whether and how strongly it is activated) and introduces non-linearity.
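Two widely used activation functions, ReLU and the sigmoid, are shown below as a minimal sketch on single floats. The function names are illustrative, and the sigmoid as written is only suitable for moderate inputs (very large negative values would overflow `math.exp`).

```python
import math


def relu(x: float) -> float:
    """Pass positive inputs through unchanged; clamp negative inputs to zero."""
    return max(0.0, x)


def sigmoid(x: float) -> float:
    """Squash any real input into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


print(relu(-2.0), relu(3.0))  # 0.0 3.0
print(sigmoid(0.0))           # 0.5
```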
Unsupervised Learning
Machine learning using datasets without predefined labels to find structure in the data.
Reinforcement Learning
A training methodology where an agent learns to make decisions by performing actions in an environment to maximize a reward.
Backpropagation
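The algorithm used to train a neural network: it propagates the output error backward through the layers, using the chain rule to compute the gradient of the loss with respect to each weight, so that the weights can be updated to minimize the error.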
Support Vector Machine (SVM)
A supervised learning model that classifies data by finding the hyperplane that separates the data points of one class from those of the other class with the largest possible margin.
Random Forest
An ensemble learning method that constructs multiple decision trees during training and outputs the mode of the classes for classification or mean prediction for regression.
Word Embedding
A representation of text in which each word is mapped to a numeric vector, such that words with similar meanings have similar vectors.
Decision Tree
A flowchart-like tree structure in which each internal node represents a test on a feature (or attribute), each branch represents a decision rule, and each leaf node represents an outcome.
Neural Networks
A series of algorithms that attempt to recognize underlying relationships in a set of data through a process that mimics the way the human brain operates.
Bias-Variance Tradeoff
The property of machine learning models whereby reducing bias (error from overly simple assumptions) tends to increase variance (sensitivity to the particular training set), and reducing variance tends to increase bias.
Natural Language Processing (NLP)
A field of AI that gives machines the ability to read, understand and derive meaning from human languages.
Dimensionality Reduction
The process of reducing the number of random variables under consideration by obtaining a set of principal variables.
Model Evaluation
The process of measuring the performance of a machine learning model on a specific set of data, usually data it has not seen before, using metrics such as accuracy, precision, and recall.
One-hot Encoding
A process that converts a categorical variable into a set of binary (0/1) columns, one per category, so that it can be provided to machine learning algorithms that require numeric input.
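A minimal sketch of one-hot encoding in plain Python follows. The helper name `one_hot_encode` and the color values are hypothetical; ordering the columns alphabetically is just one possible convention.

```python
def one_hot_encode(values):
    """Return (categories, rows), with one 0/1 column per distinct category."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


categories, rows = one_hot_encode(["red", "green", "red", "blue"])
print(categories)  # ['blue', 'green', 'red']
print(rows)        # [[0, 0, 1], [0, 1, 0], [0, 0, 1], [1, 0, 0]]
```

Each row contains exactly one 1, so the encoding does not impose any artificial ordering on the categories the way integer codes would.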
Confusion Matrix
A table used to describe the performance of a classification model on a set of test data for which the true values are known.
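For a binary classifier the table has four cells. The sketch below counts them from two label lists; the function name `confusion_counts` and the example labels are assumptions for illustration only.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            counts["tp" if actual == positive else "fp"] += 1
        else:
            counts["fn" if actual == positive else "tn"] += 1
    return counts


# Hypothetical true labels and model predictions.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
print(confusion_counts(y_true, y_pred))  # {'tp': 2, 'fp': 1, 'fn': 1, 'tn': 2}
```

Precision (tp / (tp + fp)) and recall (tp / (tp + fn)) can both be read directly off these four counts.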
Anomaly Detection
The identification of rare items, events, or observations which raise suspicions by differing significantly from the majority of the data.
Supervised Learning
A type of machine learning where models are trained on labeled datasets.
Feature Scaling
A method used to standardize the range of independent variables or features of data.
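Two common forms of feature scaling are min-max scaling to the range [0, 1] and standardization to zero mean and unit variance. The sketch below shows both for a single feature; the function names are illustrative, and both assume the feature is not constant (otherwise the denominators would be zero).

```python
from statistics import mean, pstdev


def min_max_scale(xs):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]


def standardize(xs):
    """Shift and rescale values to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(xs), pstdev(xs)
    return [(x - mu) / sigma for x in xs]


print(min_max_scale([10, 20, 30]))  # [0.0, 0.5, 1.0]
print(standardize([2, 4, 4, 4, 5, 5, 7, 9]))
```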
Regularization
The process of adding information in order to prevent overfitting, typically by adding a penalty on the magnitude of parameters of the model.
Precision
The ratio of correctly predicted positive observations to the total predicted positive observations.
Transfer Learning
A machine learning method where a model developed for a task is reused as the starting point for a model on a second task.
Train/Test Split
A method of dividing the dataset into two parts so that one can be used for training the model, and the other for testing its performance.
Principal Component Analysis (PCA)
A statistical technique to simplify a dataset by reducing its number of dimensions while preserving as much variance as possible.
Boosting
An ensemble technique that trains a series of weak learners sequentially, each one focusing on the examples misclassified by the previous models, and combines them into a single strong learner.
Overfitting
Occurs when a model learns the training data too well, including the noise, and performs poorly on new data.
Dropout
A regularization technique for neural networks that randomly deactivates a fraction of units at each training step, reducing overfitting by preventing complex co-adaptations on the training data.
Generative Adversarial Networks (GANs)
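A class of machine learning frameworks in which two networks, a generator and a discriminator, are trained against each other so that the generator learns to produce new data instances that resemble the training data.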
Data Preprocessing
A process applied to raw data to prepare it for further analysis and modeling, typically including cleaning, normalization, transformation, and feature selection.
Batch Normalization
A technique for training very deep neural networks that standardizes the inputs to a layer for each mini-batch.
Gradient Descent
An optimization algorithm that iteratively moves towards the minimum of a function by taking steps proportional to the negative of the gradient.
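The update rule (step in the direction of the negative gradient, scaled by a step size) can be sketched for a one-variable function. The function name `gradient_descent`, the example function f(x) = (x - 3)^2, and the default step size are assumptions chosen for illustration.

```python
def gradient_descent(grad, x0, learning_rate=0.1, n_steps=100):
    """Minimize a 1-D function given its gradient, starting from x0."""
    x = x0
    for _ in range(n_steps):
        # Step against the gradient, scaled by the learning rate.
        x -= learning_rate * grad(x)
    return x


# Example: f(x) = (x - 3)^2 has gradient 2 * (x - 3) and its minimum at x = 3.
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
print(round(x_min, 6))  # 3.0
```

The step size matters: for this particular function, any learning rate above 1.0 makes each step overshoot by more than the previous error, so the iterates move away from the minimum instead of toward it.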
Learning Rate
A hyperparameter that controls how much the weights in the network are adjusted with respect to the loss gradient.
Bagging
Short for bootstrap aggregating: an ensemble technique where multiple versions of a predictor are trained on different bootstrap samples (random subsets drawn with replacement) of the training data, and their results are combined by averaging or voting.
© Hypatia.Tech. 2024 All rights reserved.