Machine Learning Security and Privacy
Fairness and Bias
Fairness and bias issues arise when a model's decisions disproportionately affect certain groups. Mitigation includes auditing datasets for bias, using fairness metrics during model evaluation, and applying fairness-aware algorithms.
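As a concrete illustration, here is a minimal sketch of one common fairness metric, demographic parity, which compares positive-prediction rates across groups (the function name and data are illustrative):

```python
import numpy as np

def demographic_parity_gap(y_pred, group):
    """Absolute difference in positive-prediction rates between groups 0 and 1."""
    y_pred, group = np.asarray(y_pred), np.asarray(group)
    return abs(y_pred[group == 0].mean() - y_pred[group == 1].mean())

# Audit example: a gap near 0 suggests parity; a large gap flags potential bias.
preds  = np.array([1, 0, 1, 1, 0, 0, 1, 0])
groups = np.array([0, 0, 0, 0, 1, 1, 1, 1])
print(demographic_parity_gap(preds, groups))  # 0.5 -> worth investigating
```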
Replay Attacks
Replay attacks re-submit previously captured, valid data to trick a system into treating stale inputs as fresh. Prevention involves timestamping inputs, detecting anomalous repetition, and checking that the data is contextually relevant.
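A minimal sketch of timestamp-and-nonce checking, one way to reject stale or repeated inputs (the message format, window size, and `accept` helper are illustrative assumptions):

```python
import time

SEEN_NONCES = set()
MAX_AGE_SECONDS = 30

def accept(message):
    """Accept a message only if it is fresh and its nonce has not been seen."""
    if time.time() - message["timestamp"] > MAX_AGE_SECONDS:
        return False                      # too old: likely replayed traffic
    if message["nonce"] in SEEN_NONCES:
        return False                      # nonce reuse: definite replay
    SEEN_NONCES.add(message["nonce"])
    return True

msg = {"timestamp": time.time(), "nonce": "a1b2c3", "payload": [0.4, 0.9]}
print(accept(msg))  # True: fresh and unseen
print(accept(msg))  # False: same nonce submitted again
```

In a real deployment the timestamp and nonce would also need to be authenticated (for example with an HMAC) so an attacker cannot forge fresh-looking messages.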
Data Poisoning
Data poisoning involves injecting malicious data into a dataset, leading to compromised model performance. Mitigation strategies include robust data validation, anomaly detection, and regular model retraining with trusted data sources.
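A minimal sketch of one simple validation pass, flagging statistical outliers before they enter the training set (the z-score threshold and data are illustrative assumptions; real poisoning can be far subtler than gross outliers):

```python
import numpy as np

def flag_outliers(X, threshold=3.0):
    """Boolean mask of rows whose per-feature z-score exceeds the threshold."""
    z = np.abs((X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12))
    return (z > threshold).any(axis=1)

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, size=(100, 2)),   # legitimate data
               [[25.0, -30.0]]])                  # one injected poison point
mask = flag_outliers(X)
print(mask.sum(), "suspicious row(s)")            # the extreme row is flagged
X_clean = X[~mask]                                # train only on vetted rows
```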
Information Leakage
Information leakage occurs when a model unintentionally exposes sensitive information. This can be mitigated by reducing model complexity, applying data anonymization, and ensuring proper access controls.
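A minimal sketch of one anonymization step, dropping direct identifiers and replacing them with a salted pseudonym before records reach a training pipeline (the field names and salt are illustrative assumptions):

```python
import hashlib

SALT = b"rotate-me-regularly"          # keep secret; rotating it unlinks records
DIRECT_IDENTIFIERS = {"name", "email"}

def anonymize(record):
    """Drop direct identifiers, replacing them with a salted pseudonym."""
    clean = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    raw = "|".join(str(record.get(k, "")) for k in sorted(DIRECT_IDENTIFIERS))
    clean["pseudonym"] = hashlib.sha256(SALT + raw.encode()).hexdigest()[:16]
    return clean

print(anonymize({"name": "Ada", "email": "ada@example.com", "age": 36}))
# {'age': 36, 'pseudonym': '...'} -- no direct identifiers leave the pipeline
```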
Membership Inference Attacks
Membership inference attacks determine if a specific data point was used in the model's training set. To prevent this, use techniques like data generalization, noise addition, and regularizing models to reduce overfitting.
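Membership inference often exploits the unusually high confidence a model assigns to its training points. A minimal sketch of blunting that signal by adding noise to released probabilities (the noise scale and function name are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_predict_proba(probs, scale=0.05):
    """Perturb class probabilities and renormalize before releasing them."""
    noisy = np.clip(probs + rng.normal(0.0, scale, size=probs.shape), 1e-6, None)
    return noisy / noisy.sum(axis=-1, keepdims=True)

# Attackers often flag membership via suspiciously confident outputs:
memorized_looking = np.array([[0.99, 0.01]])
print(noisy_predict_proba(memorized_looking))  # smoothed, weakening the signal
```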
Evasion Attacks (Adversarial Examples)
Evasion attacks, or adversarial examples, involve subtly altering inputs to mislead models. Defenses include training on adversarial examples, using model ensembles, and deploying adversarial detection systems.
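A minimal sketch of the fast gradient sign method (FGSM), a classic way to craft adversarial examples, applied to a toy logistic model; adversarial training then simply includes such perturbed points in the training data (the weights and epsilon are illustrative):

```python
import numpy as np

w, b = np.array([2.0, -1.0]), 0.0                  # toy "trained" linear model

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm(x, y, eps=0.2):
    """Shift x by eps in the sign of the loss gradient to degrade the model."""
    p = sigmoid(w @ x + b)
    grad_x = (p - y) * w                           # d(cross-entropy)/dx
    return x + eps * np.sign(grad_x)

x, y = np.array([1.0, 0.5]), 1.0
x_adv = fgsm(x, y)
print(sigmoid(w @ x + b))      # ~0.82 confidence on the clean input
print(sigmoid(w @ x_adv + b))  # ~0.71 on the subtly perturbed input
# Adversarial training: add (x_adv, y) to the training data so the model
# learns to classify perturbed inputs correctly.
```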
Backdoor Attacks
Backdoor attacks embed hidden behavior in a model during training, which can be activated by trigger inputs. Countermeasures include thorough validation of training data, model inspection, and anomaly detection to identify triggers.
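A minimal sketch of one crude trigger-hunting heuristic: scanning a training set for an identical patch repeated across suspiciously many images (the patch location, size, and threshold are illustrative assumptions, not a general-purpose detector):

```python
from collections import Counter
import numpy as np

def find_repeated_patch(images, size=3):
    """Flag images whose corner patch recurs in suspiciously many images."""
    patches = [img[-size:, -size:].tobytes() for img in images]
    counts = Counter(patches)
    threshold = max(2, 0.01 * len(images))
    return np.array([counts[p] > threshold for p in patches])

rng = np.random.default_rng(1)
imgs = rng.random((200, 8, 8))
imgs[:10, -3:, -3:] = 1.0                      # 10 poisoned images share a trigger
print(np.where(find_repeated_patch(imgs))[0])  # indices of the suspect images
```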
Model Stealing (Extraction)
Model stealing is the unauthorized replication of a model by probing the system with inputs and observing outputs. Mitigation includes API rate limiting, monitoring for suspicious activity, and employing watermarking techniques.
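A minimal sketch of per-client rate limiting on a prediction API, which slows the high-volume probing that extraction relies on (the client ID, window, and budget are illustrative assumptions):

```python
import time
from collections import defaultdict, deque

WINDOW_SECONDS, MAX_QUERIES = 60, 100
history = defaultdict(deque)        # client_id -> timestamps of recent queries

def allow_query(client_id):
    """Permit a query only if the client is under its per-window budget."""
    now, recent = time.time(), history[client_id]
    while recent and now - recent[0] > WINDOW_SECONDS:
        recent.popleft()            # discard timestamps outside the window
    if len(recent) >= MAX_QUERIES:
        return False                # throttled: possible extraction probing
    recent.append(now)
    return True

print(all(allow_query("client-42") for _ in range(100)))  # True: within budget
print(allow_query("client-42"))                           # False: throttled
```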
Model Inversion Attacks
Model inversion attacks aim to reconstruct sensitive training data, such as individual records or input features, from a model's outputs. Protecting against this involves limiting model access, applying differential privacy techniques, and reducing model overfitting.
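A minimal sketch of the Laplace mechanism, a basic building block of differential privacy, applied to a released statistic (the sensitivity bound and epsilon value are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

def laplace_release(true_value, sensitivity, epsilon):
    """Release a statistic with Laplace noise scaled to sensitivity / epsilon."""
    return true_value + rng.laplace(0.0, sensitivity / epsilon)

# Releasing a mean over n records bounded in [0, 1]: changing one record
# shifts the mean by at most 1/n, so the sensitivity is 1/n.
records = rng.random(1000)
print(records.mean())                                            # exact value
print(laplace_release(records.mean(), 1.0 / len(records), 0.5))  # private value
```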
Trojan Attacks
Trojan attacks alter a model so that inputs containing a specific trigger are misclassified. Defense strategies include inspecting and cleaning the training dataset, implementing strong model validation, and avoiding untrusted pre-trained models.
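A minimal sketch of one validation step for third-party weights: verifying a downloaded file against a digest published by a trusted source (the file name and contents are illustrative; the demo writes its own dummy file so it runs end to end):

```python
import hashlib

def sha256_of(path):
    """Stream a file through SHA-256 and return its hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

# Demo setup: write dummy "weights" and compute the digest a trusted publisher
# would have listed alongside the download.
with open("weights.bin", "wb") as f:
    f.write(b"model-weights")
TRUSTED_SHA256 = hashlib.sha256(b"model-weights").hexdigest()

if sha256_of("weights.bin") == TRUSTED_SHA256:
    print("checksum OK: safe to load")
else:
    print("mismatch: weights may have been tampered with; do not load")
```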