Data Mining Techniques
10 Flashcards
Classification
A technique that assigns each item in a dataset to one of a predefined set of categories or classes. Commonly used in spam filtering.
Support Vector Machines (SVM)
A supervised learning model with associated learning algorithms that analyze data for classification and regression. Works well for complex but small- or medium-sized datasets.
Decision Trees
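A decision support tool that uses a tree-like graph or model of decisions and their possible consequences. In data mining, each internal node tests a feature and each leaf gives a predicted class or value, so the tree can be read as a sequence of rules for reaching a decision.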
Clustering
A method of grouping a set of objects so that objects in the same group (cluster) are more similar to each other than to those in other groups. Unlike classification, the groups are not defined in advance. Used in market segmentation.
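As a rough illustration of the idea, here is a minimal sketch of k-means (one common clustering algorithm, chosen here as an example) on one-dimensional data. The function name `kmeans_1d`, the starting centroids, and the spend figures are all hypothetical.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Lloyd's algorithm on 1-D data, starting from caller-supplied centroids."""
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its members
        # (an empty cluster keeps its previous centroid).
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical monthly spend per customer: two obvious segments.
spend = [12, 15, 14, 90, 95, 88]
centroids, clusters = kmeans_1d(spend, centroids=[0, 100])
print(clusters)
```

The result depends on the starting centroids and the chosen number of clusters, so real use typically tries several initializations.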
Principal Component Analysis (PCA)
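A statistical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables, called principal components. The components are ordered by how much variance they capture, which makes PCA useful for dimensionality reduction.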
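To make the "orthogonal transformation" concrete, the sketch below handles only the two-variable case, where the transformation is a plain rotation whose angle has a closed form. The helper names `pca_2d` and `covariance` and the sample points are assumptions made for illustration; real datasets would normally go through a linear algebra library.

```python
import math


def covariance(pairs):
    """Sample covariance between the two coordinates of a list of pairs."""
    n = len(pairs)
    mean_u = sum(u for u, _ in pairs) / n
    mean_v = sum(v for _, v in pairs) / n
    return sum((u - mean_u) * (v - mean_v) for u, v in pairs) / (n - 1)


def pca_2d(points):
    """Rotate centered 2-D points onto their principal axes."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    centered = [(x - mean_x, y - mean_y) for x, y in points]
    # Entries of the 2x2 sample covariance matrix [[a, b], [b, c]].
    a = sum(x * x for x, _ in centered) / (n - 1)
    c = sum(y * y for _, y in centered) / (n - 1)
    b = sum(x * y for x, y in centered) / (n - 1)
    # Angle of the first principal axis (direction of largest variance).
    theta = 0.5 * math.atan2(2 * b, a - c)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # Orthogonal transformation: a rotation by -theta.
    return [(x * cos_t + y * sin_t, -x * sin_t + y * cos_t) for x, y in centered]


# Hypothetical, strongly correlated observations.
points = [(1, 2), (2, 4.1), (3, 5.9), (4, 8.2), (5, 9.8)]
scores = pca_2d(points)
print(covariance(points), covariance(scores))
```

The covariance of the original coordinates is clearly nonzero, while the covariance of the rotated scores is zero up to floating-point error.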
Ensemble Methods
Techniques that create multiple models and then combine them to produce improved results. Used extensively in competitions like Kaggle.
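One simple way to combine models is majority voting, sketched below. The three rule-based "models" and the spam/ham labels are hypothetical stand-ins for trained classifiers; other ensemble methods, such as bagging or boosting, combine models differently.

```python
from collections import Counter


def majority_vote(models, x):
    """Return the label predicted by most of the models for input x."""
    votes = [model(x) for model in models]
    return Counter(votes).most_common(1)[0][0]


# Hypothetical weak rules, each of which is wrong on its own fairly often.
def has_link(msg):
    return "spam" if "http" in msg else "ham"


def shouts(msg):
    return "spam" if msg.isupper() else "ham"


def mentions_prize(msg):
    return "spam" if "prize" in msg.lower() else "ham"


models = [has_link, shouts, mentions_prize]
print(majority_vote(models, "WIN A PRIZE NOW"))
print(majority_vote(models, "see you at lunch"))
```

Using an odd number of models with two labels avoids ties in the vote.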
Neural Networks
A set of algorithms modeled loosely after the human brain that are designed to recognize patterns. Used in image and speech recognition.
Association Rule Learning
A technique for discovering interesting relationships between variables in large databases, as in market basket analysis, where rules such as "customers who buy bread also buy milk" are extracted from transaction data.
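Rules of this kind are usually scored with support and confidence. The sketch below computes both for a handful of hypothetical baskets; the function names and the data are assumptions for illustration, not a full rule-mining algorithm such as Apriori.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical shopping baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

print(support(baskets, {"bread", "milk"}))       # share of baskets with both items
print(confidence(baskets, {"bread"}, {"milk"}))  # share of bread baskets that also have milk
```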
Regression
A technique that models the relationship between a target variable and one or more features by fitting a linear or non-linear equation to observed data. Used in forecasting and predicting continuous values.
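For the simplest case, one feature and a straight line, the fit can be sketched with ordinary least squares. The function name `fit_line` and the data points are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical observations that lie exactly on y = 2x + 1.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
slope, intercept = fit_line(xs, ys)

# Forecast a continuous value for an unseen input.
forecast = slope * 5 + intercept
print(slope, intercept, forecast)
```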
Anomaly Detection
Identifies unusual patterns that do not conform to expected behavior. Commonly used in fraud detection.
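A basic way to flag such patterns is a z-score test, sketched below. The threshold of 2.0 and the transaction amounts are assumptions chosen for this small example; a single extreme value inflates the mean and standard deviation, so practical systems often prefer more robust statistics.

```python
import statistics


def zscore_anomalies(values, threshold=2.0):
    """Return values lying more than `threshold` standard deviations from the mean."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mean) / std > threshold]


# Hypothetical card transaction amounts with one suspicious charge.
amounts = [20, 22, 19, 21, 20, 23, 21, 500]
flagged = zscore_anomalies(amounts)
print(flagged)
```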
© Hypatia.Tech. 2024 All rights reserved.