Clustering Algorithms
8 Flashcards
Hierarchical Clustering
Builds a tree of clusters by successively merging clusters (agglomerative) or splitting them (divisive) according to a distance measure. Ideal for discovering hierarchical relationships and when the number of clusters is not known in advance.
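A minimal agglomerative sketch of this card using scikit-learn; the toy data and parameters below are illustrative assumptions, not part of the card.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

# Two well-separated groups of 2-D points (illustrative toy data).
X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
              [5.0, 5.0], [5.1, 5.2], [5.2, 5.1]])

# Successively merge the two closest clusters; Ward linkage merges the
# pair whose union has the smallest increase in within-cluster variance.
model = AgglomerativeClustering(n_clusters=2, linkage="ward")
labels = model.fit_predict(X)
```

Omitting `n_clusters` and setting a `distance_threshold` instead lets the tree be cut by distance rather than by a preset cluster count.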
OPTICS (Ordering Points To Identify the Clustering Structure)
Similar to DBSCAN, but produces a reachability ordering of the points that encodes the clustering structure rather than a single flat partition. Ideal for data with varying density and when cluster separation is not clear-cut.
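A minimal scikit-learn sketch of this card; the two blobs below deliberately have different densities, which a single DBSCAN `eps` handles poorly. Data and `min_samples` are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import OPTICS

# One tight blob and one diffuse blob (illustrative toy data).
rng = np.random.default_rng(0)
dense = rng.normal(0.0, 0.1, size=(20, 2))
sparse = rng.normal(5.0, 0.5, size=(20, 2))
X = np.vstack([dense, sparse])

model = OPTICS(min_samples=5).fit(X)
# ordering_ and reachability_ together form the ordered list of points
# that represents the clustering structure; labels_ is one flat
# extraction from it (-1 marks noise).
labels = model.labels_
```

Valleys in a plot of `model.reachability_[model.ordering_]` correspond to clusters at different density levels.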
DBSCAN (Density-Based Spatial Clustering of Applications with Noise)
Groups points in high-density regions into clusters and labels points in low-density regions as noise, making it robust to outliers. It can discover clusters of arbitrary shape and is ideal for spatial data with noise.
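A minimal scikit-learn sketch of this card; the toy data, `eps`, and `min_samples` are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Two dense blobs plus one far-away outlier (illustrative toy data).
X = np.array([[0.0, 0.0], [0.1, 0.1], [0.0, 0.1],
              [4.0, 4.0], [4.1, 4.1], [4.0, 4.1],
              [10.0, 10.0]])

# A point with at least min_samples neighbours (itself included) within
# eps is a core point; points reachable from no core point get label -1.
labels = DBSCAN(eps=0.5, min_samples=3).fit_predict(X)
```

No cluster count is specified; the number of clusters falls out of the density structure, and the isolated point is labelled noise.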
K-Means Clustering
A partitioning method that divides data into non-overlapping subsets (clusters) by minimizing the variance within each cluster. Ideal for spherical cluster shapes and large datasets.
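A minimal scikit-learn sketch of this card; the toy data and `n_clusters` are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

# Two compact, roughly spherical blobs (illustrative toy data).
X = np.array([[0.0, 0.0], [0.2, 0.1], [0.1, 0.2],
              [5.0, 5.0], [5.2, 5.1], [5.1, 5.2]])

# Alternates two steps until stable: assign each point to its nearest
# centroid, then move each centroid to the mean of its assigned points,
# minimizing within-cluster variance (inertia).
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
labels = km.labels_
```

`km.inertia_` exposes the minimized within-cluster sum of squares, which is one common way to pick `n_clusters` (the "elbow" heuristic).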
Affinity Propagation
Exchanges messages between pairs of data points until a set of exemplars (cluster centers) emerges. Ideal for small to medium-sized datasets and when the number of clusters is not known in advance.
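A minimal scikit-learn sketch of this card; the toy data are an illustrative assumption.

```python
import numpy as np
from sklearn.cluster import AffinityPropagation

# Two small groups of points (illustrative toy data).
X = np.array([[0.0, 0.0], [0.1, 0.1], [0.2, 0.0],
              [5.0, 5.0], [5.1, 5.1], [5.0, 5.2]])

# "Responsibility" and "availability" messages are passed between all
# pairs of points until a set of exemplars emerges; the number of
# clusters is not specified up front.
ap = AffinityPropagation(random_state=0).fit(X)
labels = ap.labels_
exemplars = ap.cluster_centers_indices_  # indices of the chosen exemplars
```

The all-pairs message passing is what limits the method to small and medium datasets: memory and time grow quadratically with the number of points.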
Mean Shift Clustering
A centroid-based algorithm that updates candidates for centroids to be the mean of the points within a given region. Ideal for complex cluster shapes and when the number of clusters is not known.
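A minimal scikit-learn sketch of this card; the toy data and `bandwidth` are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import MeanShift

# Two groups of points (illustrative toy data).
X = np.array([[0.0, 0.0], [0.1, 0.1], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.1], [5.0, 5.1]])

# Each candidate centroid is repeatedly shifted to the mean of the
# points within its bandwidth until it converges on a density peak;
# candidates that converge to the same peak are merged.
ms = MeanShift(bandwidth=1.0).fit(X)
labels = ms.labels_
```

The cluster count is not specified; it is determined by how many density peaks the chosen bandwidth resolves.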
Spectral Clustering
Uses the eigenvectors of a similarity matrix to embed the data in a lower-dimensional space before clustering there. Ideal for non-convex clusters or when a graph representation of the data is available.
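A minimal scikit-learn sketch of this card on two concentric rings, a non-convex case that K-Means cannot separate; the data and the RBF `gamma` are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

# Two concentric rings (illustrative toy data).
theta = np.linspace(0, 2 * np.pi, 40, endpoint=False)
inner = np.c_[np.cos(theta), np.sin(theta)]
outer = np.c_[5 * np.cos(theta), 5 * np.sin(theta)]
X = np.vstack([inner, outer])

# Builds an RBF similarity matrix, embeds the points using eigenvectors
# of its graph Laplacian, and runs K-Means in that low-dimensional
# embedding, where the rings become linearly separable.
sc = SpectralClustering(n_clusters=2, affinity="rbf",
                        gamma=1.0, random_state=0)
labels = sc.fit_predict(X)
```

With `affinity="precomputed"` the same estimator accepts an adjacency matrix directly, covering the "graph representation is available" case.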
Expectation-Maximization (EM) Clustering using Gaussian Mixture Models (GMM)
A probabilistic model that assumes data points are generated from a mixture of several Gaussian distributions. Ideal for soft clustering and when cluster membership is a hidden (latent) variable.
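A minimal scikit-learn sketch of this card; the toy data and `n_components` are illustrative assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Samples from two well-separated Gaussians (illustrative toy data).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, size=(50, 2)),
               rng.normal(5.0, 0.5, size=(50, 2))])

# EM alternates an E-step (compute each Gaussian's responsibility for
# each point, treating membership as a latent variable) and an M-step
# (re-estimate means, covariances, and mixture weights).
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
probs = gmm.predict_proba(X)  # soft assignments; each row sums to 1
labels = gmm.predict(X)       # hard labels, if a flat partition is needed
```

The `probs` matrix is what makes this a soft clustering: a point near a boundary gets split responsibility instead of a single label.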