Data Imbalance Handling
Cost-sensitive Learning
This technique assigns a higher cost to misclassifying the minority class than to misclassifying the majority class. Because errors on the minority class are penalized more heavily during training, the classifier pays more attention to that class.
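As a rough illustration, the sketch below derives per-class costs with the common "balanced" heuristic (total samples divided by number of classes times class count) and applies them in a weighted log loss. The function names `balanced_class_weights` and `weighted_log_loss` and the toy labels are assumptions made for this example, not part of the original card.

```python
import math
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def weighted_log_loss(labels, probs, weights):
    """Mean binary cross-entropy, with each example scaled by its class weight.

    probs[i] is the predicted probability that example i belongs to class 1.
    """
    eps = 1e-12  # keeps log() away from zero
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1 - eps)
        loss = -math.log(p) if y == 1 else -math.log(1 - p)
        total += weights[y] * loss
    return total / len(labels)


# Toy data (hypothetical): nine majority examples (0) and one minority example (1).
labels = [0] * 9 + [1]
weights = balanced_class_weights(labels)

# A model that always says "probably class 0" looks cheap without weights
# and much more expensive once minority errors carry a higher cost.
probs = [0.1] * len(labels)
plain_loss = weighted_log_loss(labels, probs, {0: 1.0, 1: 1.0})
costly_loss = weighted_log_loss(labels, probs, weights)
```

Many libraries expose the same idea through a class-weight setting, so in practice the loss rarely needs to be written by hand.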
Undersampling the Majority Class
This technique removes instances of the majority class to balance the dataset. It is used to prevent the classifier from being overwhelmed by the majority class, which can improve performance on minority classes at the cost of discarding some majority-class data.
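A minimal sketch of random undersampling, assuming every class is cut down to the size of the smallest one. The helper name `random_undersample`, the fixed seed, and the toy data are illustrative assumptions.

```python
import random
from collections import defaultdict


def random_undersample(samples, labels, seed=0):
    """Randomly keep only as many examples per class as the smallest class has."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)

    target = min(len(items) for items in by_class.values())
    out_x, out_y = [], []
    for cls, items in by_class.items():
        kept = rng.sample(items, target)  # sampling without replacement
        out_x.extend(kept)
        out_y.extend([cls] * target)
    return out_x, out_y


# Toy data (hypothetical): nine majority examples and three minority examples.
samples = list(range(12))
labels = [0] * 9 + [1] * 3
new_x, new_y = random_undersample(samples, labels)
```

After the call, both classes contain three examples; the six discarded majority examples are simply not used for training.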
Anomaly Detection Techniques
These techniques treat the minority class as anomalies in the data. They are used when the minority class is not just rare but is also considered abnormal, and they aim to identify these rare events rather than to learn a conventional decision boundary between classes.
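One very simple way to picture this is a z-score detector that is fitted on majority ("normal") data only and flags anything far from it. This is a sketch under the assumption of a single numeric feature; the function names, the threshold of 3, and the sample values are all hypothetical.

```python
import statistics


def fit_zscore_detector(normal_values):
    """Estimate the mean and standard deviation of the normal (majority) data."""
    mean = statistics.fmean(normal_values)
    std = statistics.pstdev(normal_values)
    if std == 0:
        raise ValueError("normal data has zero variance; z-scores are undefined")
    return mean, std


def is_anomaly(value, mean, std, threshold=3.0):
    """Flag a value whose distance from the mean exceeds `threshold` standard deviations."""
    return abs(value - mean) / std > threshold


# Toy sensor readings (hypothetical) that represent normal behaviour.
normal = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0]
mean, std = fit_zscore_detector(normal)

typical_flag = is_anomaly(10.1, mean, std)
rare_flag = is_anomaly(15.0, mean, std)
```

Note that the detector never sees a labeled minority example; it only learns what "normal" looks like.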
Ensemble Methods
Ensemble methods such as Random Forest and Boosting can handle imbalanced data by aggregating multiple classifiers. They are used to generate more robust predictions and can be adjusted to give more importance to the minority class, for example through class weights or balanced resampling of the training data for each member.
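The sketch below shows one way an ensemble can be tilted toward the minority class: balanced bagging, where each member is trained on a bootstrap with equal numbers of majority and minority examples and the members vote. It is not Random Forest or Boosting itself. The base learner (a nearest class-mean rule on one feature), the label convention (0 = majority, 1 = minority), and all names are assumptions made for illustration.

```python
import random
from statistics import fmean


def train_centroid(xs, ys):
    """Base learner: store the mean feature value of each class."""
    c0 = fmean(x for x, y in zip(xs, ys) if y == 0)
    c1 = fmean(x for x, y in zip(xs, ys) if y == 1)
    return c0, c1


def predict_centroid(model, x):
    """Predict the class whose mean is closer to x."""
    c0, c1 = model
    return 1 if abs(x - c1) < abs(x - c0) else 0


def train_balanced_bagging(xs, ys, n_estimators=11, seed=0):
    """Train each member on a class-balanced bootstrap sample."""
    rng = random.Random(seed)
    majority = [x for x, y in zip(xs, ys) if y == 0]
    minority = [x for x, y in zip(xs, ys) if y == 1]
    k = len(minority)
    models = []
    for _ in range(n_estimators):
        boot_maj = rng.choices(majority, k=k)  # drawn with replacement
        boot_min = rng.choices(minority, k=k)
        models.append(train_centroid(boot_maj + boot_min, [0] * k + [1] * k))
    return models


def predict_ensemble(models, x):
    """Majority vote over all members."""
    votes = sum(predict_centroid(m, x) for m in models)
    return 1 if votes * 2 > len(models) else 0


# Toy data (hypothetical): ten majority points near 1, three minority points near 9.
xs = [0.0, 0.5, 1.0, 1.5, 2.0, 0.2, 0.8, 1.2, 1.7, 0.4, 8.0, 9.0, 10.0]
ys = [0] * 10 + [1] * 3
models = train_balanced_bagging(xs, ys)
```

Because every member sees the two classes in equal proportion, no single member is dominated by the majority class.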
Cluster-based Oversampling
This technique first clusters the minority class and then oversamples within each cluster. It helps preserve the information richness of the minority class, including its smaller sub-groups, and helps avoid generating overlapping samples.
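A minimal sketch of the two steps, assuming one numeric feature, a tiny hand-written k-means, and oversampling by replication until each cluster reaches a target size. The function names, the deterministic centroid initialization, and the toy values are assumptions for this example.

```python
import random


def kmeans_1d(values, k, n_iter=20):
    """Tiny 1-D k-means (assumes k >= 2); centroids start at evenly spaced sorted values."""
    ordered = sorted(values)
    centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centroids[j]))
            clusters[nearest].append(v)
        # Recompute each centroid; keep the old one if a cluster is empty.
        centroids = [
            sum(c) / len(c) if c else centroids[j] for j, c in enumerate(clusters)
        ]
    return clusters


def cluster_oversample(minority, k, target_per_cluster, seed=0):
    """Cluster the minority class, then replicate within each cluster up to a target size."""
    rng = random.Random(seed)
    result = []
    for cluster in kmeans_1d(minority, k):
        if not cluster:
            continue
        extra = max(0, target_per_cluster - len(cluster))
        result.extend(cluster)
        result.extend(rng.choices(cluster, k=extra))  # copies drawn from this cluster only
    return result


# Toy minority class (hypothetical): a dense group near 1 and a lone point at 9.
minority = [1.0, 1.1, 1.2, 1.3, 9.0]
balanced = cluster_oversample(minority, k=2, target_per_cluster=4)
```

Here the lone point at 9.0 forms its own cluster and is replicated up to the target, so the small sub-group is not drowned out by the dense one.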
Using Different Evaluation Metrics
Instead of accuracy, metrics such as precision, recall, F1 score, and ROC AUC are used. Accuracy can look high when a model simply predicts the majority class, whereas these metrics provide a more nuanced view of the model's performance, particularly for classes of different sizes.
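To see why accuracy alone can mislead, the sketch below computes accuracy, precision, recall, and F1 from the confusion counts for a model that always predicts the majority class. The function name `binary_metrics` and the 95/5 class split are assumptions chosen for illustration.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Conventions assumed here: undefined ratios are reported as 0.0.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy data (hypothetical): 95 majority examples, 5 minority examples,
# and a model that predicts the majority class every time.
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100
metrics = binary_metrics(y_true, y_pred)
```

The model scores 95% accuracy while its recall and F1 on the minority class are both zero, which is exactly the failure that accuracy hides.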
Synthetic Minority Over-sampling Technique (SMOTE)
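SMOTE generates synthetic minority-class examples by interpolating between a minority example and one of its nearest minority-class neighbors. It is used to create a more balanced class distribution and, because it does not simply duplicate existing examples, it can help reduce overfitting compared with random oversampling.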
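The interpolation step can be sketched in a few lines. This is a simplified illustration, not a reference implementation: it assumes at least two minority points given as tuples of floats, uses Euclidean distance, and the function name `smote`, the default `k=3`, and the toy points are hypothetical.

```python
import math
import random


def smote(minority, n_synthetic, k=3, seed=0):
    """Create synthetic points on segments between minority points and their neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base_idx = rng.randrange(len(minority))
        base = minority[base_idx]
        # k nearest minority neighbors of the base point (excluding itself).
        others = [p for i, p in enumerate(minority) if i != base_idx]
        neighbors = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbor = rng.choice(neighbors)
        # Pick a random position along the segment from base to neighbor.
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbor)))
    return synthetic


# Toy minority class (hypothetical) with two features per example.
minority = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.3), (1.1, 1.1)]
synthetic = smote(minority, n_synthetic=6)
```

Each synthetic point lies between two real minority points, so the new examples fill in the region the minority class already occupies rather than repeating existing points.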
Oversampling the Minority Class
This involves adding more copies of minority-class examples to the dataset, typically by sampling them at random with replacement. Replicating these examples can help improve the classifier's performance on that class, although exact duplicates can also increase the risk of overfitting.
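A minimal sketch of random oversampling, assuming every class is topped up with randomly chosen copies until it matches the largest class. The helper name `random_oversample`, the fixed seed, and the toy data are illustrative assumptions.

```python
import random
from collections import defaultdict


def random_oversample(samples, labels, seed=0):
    """Replicate examples of smaller classes until all classes match the largest one."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)

    target = max(len(items) for items in by_class.values())
    out_x, out_y = [], []
    for cls, items in by_class.items():
        extra = rng.choices(items, k=target - len(items))  # copies, with replacement
        out_x.extend(items + extra)
        out_y.extend([cls] * target)
    return out_x, out_y


# Toy data (hypothetical): nine majority examples and three minority examples.
samples = list(range(12))
labels = [0] * 9 + [1] * 3
new_x, new_y = random_oversample(samples, labels)
```

All original examples are kept; the minority class grows from three to nine entries, six of which are duplicates.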
© Hypatia.Tech. 2024 All rights reserved.