Machine Learning Pipelines
Data Collection
The process of gathering data from various sources for use in training the machine learning model. This stage determines the quality and quantity of the data available to every later stage.
Data Cleaning
Involves preprocessing data to remove inaccuracies, duplicate values, and irrelevant information. Data cleaning ensures the model learns from high-quality, relevant data.
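As a rough illustration of the cleaning step described above, here is a minimal sketch in plain Python. The function name `clean_records`, the `required_fields` parameter, and the sample rows are all made up for this example. Real pipelines often use a dataframe library instead.

```python
def clean_records(records, required_fields):
    """Drop records with missing required fields and exact duplicates."""
    seen = set()
    cleaned = []
    for record in records:
        # Skip rows where a required field is missing or empty.
        if any(record.get(field) in (None, "") for field in required_fields):
            continue
        # Normalize strings so trivially different duplicates compare equal.
        normalized = {
            key: value.strip().lower() if isinstance(value, str) else value
            for key, value in record.items()
        }
        fingerprint = tuple(sorted(normalized.items()))
        if fingerprint in seen:
            continue
        seen.add(fingerprint)
        cleaned.append(normalized)
    return cleaned


# Hypothetical sample data: one near-duplicate and two incomplete rows.
raw = [
    {"name": "Ada ", "age": 36},
    {"name": "ada", "age": 36},
    {"name": "", "age": 41},
    {"name": "Grace", "age": None},
    {"name": "Linus", "age": 29},
]
cleaned = clean_records(raw, required_fields=("name", "age"))
```

Only the first "ada" row and the "linus" row survive here, since the others are either duplicates or incomplete.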
Model Evaluation
Assessing the model's performance with metrics such as accuracy, precision, recall, and F1 score to understand how well it predicts on the validation set and to tune it before real-world deployment.
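The four metrics named above can be computed from the counts of true and false positives and negatives. The sketch below assumes a binary classification task with label 1 as the positive class. The helper name `classification_metrics` and the sample labels are illustrative only.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical validation labels and predictions.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

With one false positive and one false negative among eight examples, all four metrics come out to 0.75 in this particular case. On imbalanced data they usually diverge.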
Model Selection
The process where different machine learning algorithms are considered and evaluated to choose the best one based on performance metrics for the specific problem at hand.
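One way to picture model selection is to fit several candidates on the same training data and compare them on held-out data. The sketch below compares a constant (mean) baseline with a least-squares line using validation mean squared error. The candidate names, helper functions, and data points are assumptions made for illustration.

```python
def fit_mean(xs, ys):
    """Baseline candidate: always predict the training mean."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Linear candidate: closed-form least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def mse(model, xs, ys):
    """Mean squared error of a fitted model on the given data."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical training and validation data with a roughly linear trend.
train_x, train_y = [0, 1, 2, 3], [0.1, 1.9, 4.2, 5.8]
val_x, val_y = [4, 5], [8.1, 9.9]

candidates = {"mean_baseline": fit_mean, "linear": fit_line}
scores = {
    name: mse(fit(train_x, train_y), val_x, val_y)
    for name, fit in candidates.items()
}
best_name = min(scores, key=scores.get)  # lowest validation error wins
```

Because the data follow a near-linear trend, the linear candidate has the lower validation error and is selected.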
Feature Engineering
The practice of selecting, modifying, or creating relevant features from raw data to enhance the performance of machine learning models. This step can greatly influence the model's predictive power.
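Two common feature transformations are scaling a numeric column and one-hot encoding a categorical one. The following is a minimal sketch of both in plain Python. The function names and the `areas` and `cities` columns are hypothetical.

```python
def min_max_scale(values):
    """Rescale numeric values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(values):
    """Encode categories as 0/1 indicator vectors, one column per category."""
    categories = sorted(set(values))
    encoded = [[1 if v == c else 0 for c in categories] for v in values]
    return encoded, categories


# Hypothetical raw columns.
areas = [50, 75, 100]
cities = ["paris", "oslo", "paris"]

scaled_areas = min_max_scale(areas)
encoded_cities, city_categories = one_hot(cities)
```

Here `scaled_areas` becomes `[0.0, 0.5, 1.0]`, and each city becomes a two-element vector ordered as `["oslo", "paris"]`.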
Model Training
The phase in which a machine learning model learns from the training data by adjusting its parameters to minimize a loss function, with the aim of making accurate predictions.
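To make "adjusting its parameters to minimize a loss function" concrete, here is a small sketch that trains a one-variable linear model with batch gradient descent on mean squared error. The function name, learning rate, epoch count, and data are assumptions chosen for this toy example.

```python
def train_linear_model(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = w * x + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the loss (1/n) * sum((w*x + b - y)^2) w.r.t. w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Step against the gradient to reduce the loss.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical training data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = train_linear_model(xs, ys)
```

The learned `w` and `b` should end up very close to 2 and 1. A learning rate that is too large for the data would make the updates diverge instead.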
Data Labeling
The process of identifying raw data and adding one or more meaningful, informative labels that give a machine learning model context to learn from. Labeling is crucial for supervised learning tasks.
Data Splitting
Dividing the dataset into separate sets, typically including a training set, validation set, and test set, to train the model and evaluate its performance in an unbiased manner.
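A simple way to produce the three sets is to shuffle once and slice. The sketch below assumes a 70/15/15 split and a fixed seed for reproducibility. Neither the proportions nor the function name `split_dataset` come from the card itself.

```python
import random


def split_dataset(items, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle items and split them into train, validation, and test lists."""
    shuffled = list(items)
    # A seeded generator keeps the split reproducible across runs.
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains
    return train, val, test


# Hypothetical dataset of 100 example IDs.
train, val, test = split_dataset(range(100))
```

Every example lands in exactly one of the three sets, so the test set stays unseen during training and tuning.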
Model Retraining
The process of updating a machine learning model with new data to maintain its accuracy and to account for any changes in the underlying patterns since it was last trained.
Model Deployment
The stage where the trained model is integrated into a production environment to generate real-time predictions or insights from new data. Deployment can vary in complexity depending on the requirements.
Monitoring and Maintenance
Ongoing tracking of the model's performance in production to ensure it maintains high accuracy and relevance, with updates and adjustments made as necessary.
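One possible form of monitoring is to track accuracy over a sliding window of recent predictions and raise a flag when it drops. The class below is a minimal sketch. `AccuracyMonitor`, its window size, and its threshold are hypothetical, and it assumes the true outcomes eventually become available in production.

```python
from collections import deque


class AccuracyMonitor:
    """Track accuracy over the most recent predictions."""

    def __init__(self, window_size=5, threshold=0.8):
        # A bounded deque discards the oldest outcome automatically.
        self.outcomes = deque(maxlen=window_size)
        self.threshold = threshold

    def record(self, prediction, actual):
        self.outcomes.append(prediction == actual)

    def accuracy(self):
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_attention(self):
        # Only raise a flag once the window is full, to avoid noisy early alerts.
        window_full = len(self.outcomes) == self.outcomes.maxlen
        return window_full and self.accuracy() < self.threshold


monitor = AccuracyMonitor(window_size=5, threshold=0.8)
for prediction, actual in [(1, 1), (0, 0), (1, 1), (1, 1), (0, 0)]:
    monitor.record(prediction, actual)
flag_when_healthy = monitor.needs_attention()

# Two wrong predictions push the windowed accuracy down to 3/5.
for prediction, actual in [(1, 0), (0, 1)]:
    monitor.record(prediction, actual)
flag_when_degraded = monitor.needs_attention()
```

A flag like this could be one trigger for retraining, alongside other signals such as shifts in the input data.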
Hyperparameter Tuning
The process of optimizing a model's hyperparameters, the settings chosen before training rather than learned from the data, to improve performance. Techniques such as grid search or random search are often used.
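Grid search, mentioned above, tries every combination of candidate values and keeps the best-scoring one. The sketch below shows the search loop only. The scoring function `toy_validation_score` is a made-up stand-in for "train a model with these settings and score it on the validation set", and the hyperparameter names are illustrative.

```python
from itertools import product


def grid_search(score_fn, param_grid):
    """Evaluate every combination in param_grid and return the best one."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_validation_score(learning_rate, depth):
    # Stand-in for a real train-and-validate step (an assumption for this
    # sketch); it peaks at learning_rate=0.1 and depth=4.
    return 1.0 - abs(learning_rate - 0.1) - 0.05 * abs(depth - 4)


best_params, best_score = grid_search(
    toy_validation_score,
    {"learning_rate": [0.01, 0.1, 1.0], "depth": [2, 4, 8]},
)
```

Random search follows the same pattern but samples combinations instead of enumerating all of them, which tends to scale better when the grid is large.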
© Hypatia.Tech. 2024 All rights reserved.