My Digital Garden

Cluster Analysis

Cluster analysis is a form of unsupervised machine learning that groups data points by the similarity of their features. Unlike classification, which learns from known labels, clustering is used to uncover hidden patterns or groupings in unlabeled data. It is particularly useful for customer segmentation, anomaly detection, recommendation systems, and exploratory data analysis.
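
As a rough illustration of grouping points by feature similarity, here is a minimal sketch of k-means (Lloyd's algorithm) using only the Python standard library. This note does not name a specific algorithm, so k-means is an assumption chosen for brevity. The function name `kmeans`, its parameters, and the sample points are all made up for the example.

```python
import math
import random

def kmeans(points, k, iterations=20, seed=0):
    """Group points into k clusters with plain k-means (Lloyd's algorithm)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centroids

# Illustrative data: two visibly separate groups of unlabeled 2-D points.
points = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8), (8.0, 8.1), (7.9, 8.3), (8.2, 7.8)]
labels, centroids = kmeans(points, k=2)
print(labels)     # one cluster index per point
print(centroids)  # one center per cluster
```

The cluster indices are arbitrary: which group is called 0 and which is called 1 depends on the random starting centroids, so only the grouping itself is meaningful.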

Why Use Clustering?

  • Discover structure: Find inherent groupings in your data.
  • Data simplification: Reduce complexity by categorizing data into clusters.
  • Feature engineering: Cluster assignments can be used as new features for supervised learning.
  • Anomaly detection: Points that lie far from every cluster, or that fit no cluster well, are candidate outliers.
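
The last two uses can be sketched together. Given cluster centers from some earlier clustering run, the nearest-cluster index can serve as a new feature, and the distance to that center can serve as a simple anomaly score. The centroids, the `threshold` value, and the helper names `nearest_cluster` and `describe` are hypothetical and chosen only for illustration. A sensible threshold depends on the data.

```python
import math

# Hypothetical centroids, as if produced by an earlier clustering run.
centroids = [(1.0, 1.0), (8.0, 8.0)]

def nearest_cluster(point, centroids):
    """Return (cluster index, distance) for the centroid closest to point."""
    distances = [math.dist(point, c) for c in centroids]
    label = min(range(len(centroids)), key=distances.__getitem__)
    return label, distances[label]

def describe(point, centroids, threshold=2.0):
    """Build a feature row: cluster label, distance, and an outlier flag."""
    label, distance = nearest_cluster(point, centroids)
    # Assumption: a fixed distance threshold marks a point as an outlier.
    return {"cluster": label, "distance": distance, "is_outlier": distance > threshold}

# Two points near a centroid and one point far from both.
rows = [describe(p, centroids) for p in [(1.1, 0.9), (7.8, 8.2), (4.5, 4.5)]]
for row in rows:
    print(row)
```

The `cluster` value could be appended to each record as a categorical feature for a supervised model, while `is_outlier` flags points worth a closer look.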

Cluster Analysis Algorithms
